SGI® CloudRack™ C2 System User’s Guide
Document Number 007-5681-001
COPYRIGHT
© 2010 SGI. All rights reserved; provided portions may be copyright in third parties, as indicated elsewhere herein. No permission is granted to copy, distribute, or create derivative works from the contents of this electronic documentation in any manner, in whole or in part, without the prior written permission of SGI.
LIMITED RIGHTS LEGEND
The software described in this document is "commercial computer software" provided with restricted rights (except as to included open/free source) as specified in the FAR 52.227-19 and/or the DFAR 227.7202, or successive sections. Use beyond license provisions is a violation of worldwide intellectual property laws, treaties and conventions. This document is provided with limited rights as defined in 52.227-14.
The electronic (software) version of this document was developed at private expense; if acquired under an agreement with the USA government or any contractor thereto, it is acquired as “commercial computer software” subject to the provisions of its applicable license agreement, as specified in (a) 48 CFR
12.212 of the FAR; or, if acquired for Department of Defense units, (b) 48 CFR 227-7202 of the DoD FAR Supplement; or sections succeeding thereto.
Contractor/manufacturer is SGI, 46600 Landing Parkway, Fremont, CA 94538.
TRADEMARKS AND ATTRIBUTIONS
Silicon Graphics, SGI, the SGI logo, and CloudRack are trademarks or registered trademarks of Silicon Graphics International Corp. or its subsidiaries in the
United States and/or other countries worldwide.
Athlon, Opteron and Phenom are trademarks or registered trademarks of Advanced Micro Devices Corporation.
InfiniBand is a trademark of the InfiniBand Trade Association.
Intel, Atom and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Linux is a registered trademark of Linus Torvalds.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through X/Open Company, Ltd.
All other trademarks mentioned herein are the property of their respective owners.
Record of Revision
Version    Description
001        July 2010
           First release
Contents
1. Introduction and Overview
    ESD and Safety Precautions
    Upgrading or Replacing Components
    Overview
    Example Nodeboard Features
    Processors
    DIMM Memory
    Server Chassis Features
    System Power Supplies
    Compute Tray Drive Subsystem
    Compute Tray Front I/O Panel
    Cooling System
    Main Enclosure Cooling
    Power Supply Cooling
    Motherboard Example Diagram
2. System Operation and Troubleshooting
    Unpacking the System and Choosing an Operating Location
    Placing Rackmounted Systems
    Choosing a Setup Location
    Rack Precautions
    Server Precautions
    Rack Operating Considerations
    Ambient Operating Temperature
    Reduced Airflow
    Mechanical Loading
    Circuit Overloading
    Reliable Ground
    Providing Power
    Troubleshooting the System
    Enclosure Power Supply Status LEDs
    Individual Tray LEDs
    System Fan Failure
3. System Interfaces Overview
    NICs 1 and 2 (HPC trays only)
    HDD
    Power
    Tray Control Panel Button Examples
    Overheat/Fan Fail LED on Back of System
4. HPC Server BIOS Information
    Starting the BIOS Setup Utility
    How To Change the Configuration Data
    Starting the Setup Utility
    Main Setup Screen
    Advanced Setup Configurations
    BOOT Features
    Remote Access Configuration
    Hardware Health Monitor
    IPMI Configuration
    SEL PEF Configuration
    The DMI Event Log
    Security Settings
    Exit Options
    BIOS Error Beep Codes
    BIOS Error Beep Code List
A. Technical Specifications
    Server Specifications and Features
    Environmental Specifications
Chapter 1. Introduction and Overview
This chapter provides an overview of your SGI CloudRack C2 workgroup cluster server’s main features.
Operating precautions are provided in this chapter, followed by a general overview of the product.
Before operating your system, familiarize yourself with the safety information in the following section.
ESD and Safety Precautions
Caution: Observe all ESD precautions. Failure to do so can result in damage to the equipment.
Wear an SGI approved wrist strap when you handle an ESD-sensitive device to eliminate possible
ESD damage to equipment. Connect the wrist strap cord directly to earth ground.
Warning:
Before operating or servicing any part of this product, read the safety precautions.
Danger:
Keep fingers and conductive tools away from high-voltage areas. Failure to follow these precautions will result in serious injury or death. The high-voltage areas of the system are indicated with high-voltage warning labels.
Caution: Power off the system only after the system software has been shut down in an orderly manner. If you power off the system before you halt the operating system, data may be corrupted.
Upgrading or Replacing Components
The SGI CloudRack C2 Component Replacement Guide (P/N 007-5682-00x) describes how to install or replace the following components in an SGI CloudRack C2 cluster server:
• Memory DIMMs
• PCIe cards
• Disk drives
• System fans
• Power supplies
Use the procedures to upgrade system components or replace failing components.
Warning:
If a lithium battery is installed in your system as a soldered part, only SGI qualified service personnel should replace this lithium battery. For a battery of another type, replace it only with the same type or an equivalent type recommended by the battery manufacturer, or an explosion could occur. Discard used batteries according to the manufacturer’s instructions.
Overview
The CloudRack C2 cluster server consists of a stand-alone 24U or 42U rack. The 24U rack holds up to 22 compute trays and the 42U rack holds up to 38 trays. The compute trays house serverboards, hard drives, and graphics or I/O options. Stand-alone enclosures are mounted on casters so they can be moved within the server room or compute lab environment. Check with your sales or service representative before loading on your server any operating system that was not provided by the SGI factory or service organization.
In addition to the compute trays and chassis, various hardware components may be included as part of your CloudRack C2 configuration as listed below:
• SATA Accessories: A minimum of one disk drive (per serverboard) is required for operation.
One (1) internal SATA backplane per compute node. One (1) SATA cable set per compute node. SAS hard drive options are also available.
• Two (2) optional PCI Express x16 riser cards (one per compute serverboard).
• Rackmount hardware for mounting the enclosure in a half-height or full-height rack.
• One (1) CD containing drivers and utilities plus optional CDs depending on order configuration.
• One or two Gbit Ethernet or (optional) InfiniBand switches are used per enclosure.
Example Nodeboard Features
At the heart of each CloudRack C2 compute tray lies one or more multi-processor based node boards (serverboards). These serverboards may be based on Intel or AMD chipsets, depending on the configuration ordered.
Processors
The cluster server system can support the following minimum/maximum configuration of processor cores:
• 24U system: 8/1056 processor cores
• 42U system: 8/1824 processor cores
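The maximum core counts above line up with the tray capacities given in the Overview (22 trays in the 24U rack, 38 trays in the 42U rack). The short Python sketch below checks that arithmetic. The per-tray figure it prints is an inference from dividing the published totals; it is not a value stated in this guide, and the function name is illustrative only.

```python
# Tray capacities as given in the Overview section of this guide.
TRAYS_PER_RACK = {"24U": 22, "42U": 38}
# Maximum processor core counts as listed above.
MAX_CORES = {"24U": 1056, "42U": 1824}


def cores_per_tray(rack: str) -> int:
    """Return the maximum cores per tray implied by the published totals."""
    total, trays = MAX_CORES[rack], TRAYS_PER_RACK[rack]
    if total % trays:
        raise ValueError(f"{total} cores do not divide evenly across {trays} trays")
    return total // trays


for rack in TRAYS_PER_RACK:
    print(rack, cores_per_tray(rack))
```

Both rack sizes work out to the same per-tray maximum, which suggests the totals assume every tray slot holds the densest tray option.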
The exact type of processors provided with your system depends on the specific configuration you ordered. Check with your sales or service representative for information on processor availability, upgrades and compatibility. The following are examples of types available for the CloudRack C2:
• One or two dual-socket Intel® Xeon® 5600 processor series-based serverboards per tray
• Six single-socket Intel® Atom® processor series-based MicroSlice serverboards per tray
• One or two dual-socket, AMD Opteron™ 4100 or 6100 processor series-based serverboards per tray
Higher-performance compute (HPC) serverboards support two Intel® Xeon quad-core processors
(a total of 4 quad-core processors per compute tray).
DIMM Memory
Memory configuration varies depending on the processor board type ordered with your system tray(s). DIMM population requirements vary from one per serverboard up to 12 per serverboard, using DIMM speeds from 667 MHz to 1333 MHz.
The HPC serverboards in each compute tray have up to six 240-pin DIMM sockets that can support up to 48 GB of registered ECC DDR3-1333/1066/800 SDRAM (96 GB total for the compute tray). As noted, this capacity can vary based on which compute trays your SGI CloudRack C2 is using. In all configurations, a minimum of 3 GB of DIMM memory per installed core is recommended for optimum performance.
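As a rough illustration of the 3 GB per core guideline, the Python sketch below computes a recommended memory size for one serverboard and caps it at the board maximum. The function name and parameters are hypothetical; the example values (two quad-core processors, 48 GB maximum) are taken from the HPC serverboard description in this chapter.

```python
GB_PER_CORE = 3  # recommended minimum per installed core, from this guide


def recommended_memory_gb(sockets: int, cores_per_socket: int, board_max_gb: int) -> int:
    """Recommended memory in GB for one serverboard, capped at the board maximum."""
    if sockets < 1 or cores_per_socket < 1 or board_max_gb < 1:
        raise ValueError("all arguments must be positive")
    recommended = sockets * cores_per_socket * GB_PER_CORE
    return min(recommended, board_max_gb)


# HPC serverboard: two quad-core processors, up to 48 GB per board.
print(recommended_memory_gb(sockets=2, cores_per_socket=4, board_max_gb=48))
```

For boards with more cores than the memory ceiling allows at 3 GB each, the sketch simply returns the ceiling; check with your sales or service representative for the supported DIMM populations.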
The SGI CloudRack C2 Component Replacement Guide provides details on replacing or installing
DIMMs in the system.
Server Chassis Features
The following sections provide a general outline of the main features of the CloudRack C2 system chassis (enclosure).
Figure 1-1 shows the front of the 24U CloudRack C2 server.
Figure 1-1    24U CloudRack C2 Server
Figure 1-2 shows the front of the 42U CloudRack C2 server.
Figure 1-2    42U CloudRack C2 Server
System Power Supplies
Your CloudRack C2 server uses up to 12 high-efficiency power supplies; each supply provides up to 240 Amps of 12-volt power to the system. The power supplies require a 200-240 Volt power cable input with a 47 Hz to 63 Hz operational range.
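The per-supply figures above imply a DC output ceiling that can be worked out with P = V × I. The Python sketch below does only that arithmetic. It ignores conversion losses, redundancy, and input circuit limits, so treat the result as an upper bound on 12-volt output rather than a site power requirement. The function name is illustrative.

```python
OUTPUT_VOLTS = 12          # DC output voltage per supply, from this guide
MAX_AMPS_PER_SUPPLY = 240  # maximum output current per supply, from this guide
MAX_SUPPLIES = 12          # maximum supplies per system, from this guide


def dc_output_watts(supplies: int) -> int:
    """Upper bound on combined 12-volt output power for the given number of supplies."""
    if not 0 <= supplies <= MAX_SUPPLIES:
        raise ValueError(f"supplies must be between 0 and {MAX_SUPPLIES}")
    return supplies * OUTPUT_VOLTS * MAX_AMPS_PER_SUPPLY


print(dc_output_watts(1))             # one supply
print(dc_output_watts(MAX_SUPPLIES))  # fully populated system
```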
Caution: The chassis power supply cord is used as the main disconnect device. Ensure that the socket outlet is located or installed near the equipment and is easily accessible.
Compute Tray Drive Subsystem
Each drive tray in the CloudRack C2 chassis is designed to support up to eight SATA or SAS hard disk drives. Specific optional trays may be available with up to four drives configured for each serverboard in the tray. Note that the standard trays provided in the CloudRack C2 support only SATA disk drives. However, by adding an optional SAS HBA, certain versions of the compute trays can support SAS disk drives. The SAS HBA also supports RAID 0 and 1 on both SAS and SATA disk drives. One HBA is added to each node of the compute tray.
Compute Tray Front I/O Panel
Each system compute tray installs via the front of the CloudRack C2 chassis. Figure 1-3 shows an example compute tray front panel. Its I/O panel typically provides COM ports, status LEDs and
Gb Ethernet ports.
Figure 1-3    Compute Tray Front Panel Example
(Callouts: Power LED, Select server up, Current server LED, Power umbilical connector, Select server down, Power status LED, HDD LED, Power button, Custom programmable LED.)
Cooling System
The server chassis has a staggered cooling design that features an enclosed 42-fan primary air shroud and up to 12 additional power supply fans. A fan speed control sensor in the main fan assemblies increases or decreases fan speed based on ambient temperature.
Figure 1-4    CloudRack C2 Cooling Fans on Rear of Enclosure
Main Enclosure Cooling
The primary cooling for the enclosure is an array of seven sets of six 5-inch (120-mm) fans.
Figure 1-4 on page 7 shows the rear of the system with the door removed. These six-fan assemblies (42 fans total) provide the primary cooling for the system compute trays and their supported optional I/O and drive hardware.
Power Supply Cooling
The smaller cooling fans (seen in Figure 1-4 on page 7) provide air for the system power supplies.
These 60-mm (2.4-inch) fans are each an integral part of the power supply unit. Each system power supply (located in the rightmost section of the enclosure) uses a separate, independent cooling fan. See the SGI CloudRack C2 Component Replacement Guide for information on replacing a failed power supply.
Motherboard Example Diagram
The SGI CloudRack C2 can be configured with trays that offer higher processor counts or higher compute power. Figure 1-5 on page 9 shows an example diagram of a high-performance computing (HPC) motherboard based on Intel chip sets.
Figure 1-5    HPC Motherboard Block Diagram Example
(Blocks shown: CPU#1 and CPU#2, each with slots labeled A1 to A3 and B1 to B3; Intel 5520 IOH36D; ICH10R with SATA #1 through SATA #6; MT25408 Connect-X IB with QSFP connector, PCI-E Gen2, DDR or QDR; PCI-E x16; Kawela with two RJ45 ports; BMC/VGA; LPCIO W83527; SST25VF016 SPI; RTL8201N PHY for the dedicated LAN.)
Chapter 2. System Operation and Troubleshooting
The first half of this chapter describes the basic steps needed to get your SGI CloudRack C2 up and running. Following these steps in the order given should enable you to have the system operational within a minimum amount of time. The second half of this chapter provides you with some basic troubleshooting advice. Use these sections to eliminate simple problems or obtain information that may be needed by your service provider.
Unpacking the System and Choosing an Operating Location
You should inspect the box the system was shipped in and note if it was damaged in any way. If the server itself shows damage you should file a damage claim with the carrier who delivered it.
Choose a location for the system that is clean, dust-free, and well ventilated. Avoid areas where heat, electrical noise and electromagnetic fields are generated. Place the system near a dedicated 200-240 Volt grounded single-phase (L6-30) power outlet or 3-phase power outlet.
The CloudRack C2 is designed to fit into a computer lab or server room environment. Take care to maintain the following operating conditions:
• The system should have a six-inch (15 cm) minimum top air clearance.
• The system should be protected from harsh environments that produce excessive vibration and heat.
• The system should be kept in a clean, dust-free location to reduce maintenance problems.
• Available power must be rated for large computer operation (30 amps at 200-240 Volts).
Placing Rackmounted Systems
Be sure to read the “Rack Precautions” on page 12 if you are having the system installed on site.
Choosing a Setup Location
Leave enough clearance in front of the rack to open the front door completely (approximately 48 inches, or 1.2 meters).
Leave sufficient clearance in the back of the rack to allow for adequate airflow and ease in servicing.
Rack-mounted systems are generally placed in a Restricted Access Location (dedicated equipment rooms, service closets and the like).
Rack Precautions
• Ensure that the leveling jacks on the bottom of the rack are fully extended to the floor with the full weight of the rack resting on them.
• The enclosure should be installed in the lowest part of the rack possible.
• In a tall rack installation, stabilizers should be attached to the rack if available.
• Always make sure the rack is stable before connecting power to the rack and internal enclosure.
• Always keep the rack's front door and all panels and components on the servers closed when not servicing to maintain proper cooling.
• When moving a rack with pivoting casters, push the rack from front to back. Pushing from the side could destabilize the rack if a caster encounters a floor irregularity/dropoff.
Server Precautions
Review the electrical and general safety precautions in Chapter 1.
For extra protection, use a regulating uninterruptible power supply (UPS) to protect the cluster server’s power supplies from power surges and voltage spikes, and to keep your system operating in case of a power failure. This is an optional device not provided by SGI with your system.
Service personnel should always allow the hot plug disk drives and power supply modules to cool before touching them. To maintain proper cooling, always keep the enclosure doors closed when the enclosure is not being serviced.
Make sure all power and data cables are properly connected and not blocking the enclosure airflow.
Rack Operating Considerations
Use the guidelines in the following subsections to properly use and maintain a server in a rack.
Ambient Operating Temperature
If installed in a closed or multi-unit rack assembly, the ambient operating temperature of the rack environment may be greater than the ambient temperature of the room. Therefore, consideration should be given to installing the equipment in an environment compatible with the manufacturer’s maximum rated ambient temperature.
Reduced Airflow
Equipment should be mounted into a rack so that the amount of airflow required for safe operation is not compromised.
Mechanical Loading
Equipment should be mounted into a rack so that a hazardous condition does not arise due to uneven mechanical loading.
Circuit Overloading
Consideration should be given to the connection of the equipment to the power supply circuitry and the effect that any possible overloading of circuits might have on over-current protection and power supply wiring. Appropriate consideration of equipment nameplate ratings should be used when addressing this concern.
Reliable Ground
A reliable ground for the system must be maintained at all times. To ensure this, the rack itself should be grounded. Particular attention should be given to power supply connections other than the direct connections to the branch circuit (for example, the use of power strips). Note that all power and data cables should be routed in such a way that they do not block the airflow generated by the enclosure fans.
Figure 2-1    Single-Phase Power Plug Example
(Callouts: power cord connector with Pin 1 (line), Pin 2 (neutral), and Ground pin; receptacle with Socket 1 (line), Socket 2 (neutral), and Ground socket.)
Providing Power
Plug the power cord from the server power supply array into a rack power distribution unit (PDU) or high-quality power source (see example in Figure 2-1) that offers protection from electrical noise and power surges.
For higher availability it is recommended that you use an optional uninterruptible power supply
(UPS) with the cluster server (not provided by SGI).
Finally, set the enable power switch on the rear of the enclosure to On (|). See Figure 2-2 for an example.
Figure 2-2    Enable Enclosure Power Switch Location
(Callout: Power switch.)
Troubleshooting the System
The following table lists recommended actions for problems that can occur. To solve problems that are not listed in this table or in another section of this chapter, contact your SGI system support engineer (SSE) or other approved service provider.
Table 2-1    Troubleshooting Chart

Problem Description: The system will not power on.
Recommended Action:
• Ensure that the power cord of the enclosure is seated properly in the power receptacle.
• Ensure the enclosure’s power supply switch is set to On (|).
• Did you push the power button on the “head node” compute tray as well as the “compute node” trays?
• If the power cord is plugged in and all the power switches are on, contact your support organization or SSE.

Problem Description: An individual compute/memory tray will not power on.
Recommended Action: View the LED outputs on the front of the tray (see also Figure 2-4 on page 18). If the LEDs are not lit, contact your SSE.

Problem Description: The system will not boot the operating system.
Recommended Action: Contact your SSE.

Problem Description: The PWR LED of a populated PCI slot in a tray is not illuminated.
Recommended Action: Refer to the SGI CloudRack C2 Component Replacement Guide and reseat the PCI card.

Problem Description: The Fault LED of a populated PCI slot is illuminated (on).
Recommended Action: Refer to the SGI CloudRack C2 Component Replacement Guide and reseat the PCI card. If the fault LED remains on, replace the PCI card.

Problem Description: The fault LED of a hard disk drive is on.
Recommended Action: Refer to the SGI CloudRack C2 Component Replacement Guide and replace the disk drive.
Enclosure Power Supply Status LEDs
Each power supply installed in a CloudRack C2 enclosure has two status LEDs (green and amber); see Figure 2-3 for an example. These LEDs can be viewed to determine if a problem with the supply exists. If the supply status LED indicates a malfunction, a service technician should replace the supply as soon as practicable. Service information is available in the SGI CloudRack C2 Component Replacement Guide.
Figure 2-3    System Power Supply Status LED Locations
(Callouts: Service required LED, System running LED.)
The LEDs will either light green or amber (yellow), or flash green or yellow to indicate the status of the individual supply. See Table 2-2 for a complete list.
Table 2-2    Power Supply LED States

Power supply status                                  Green LED    Amber LED
No AC power to the supply                            Off          Off
Power supply has failed                              Off          On
Power supply problem warning                         Off          Blinking
AC available to supply (standby) but system is off   Blinking     Off
Power supply on (system on)                          On           Off
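The LED combinations in Table 2-2 can be treated as a simple lookup. The Python sketch below transcribes the table into a dictionary keyed by the (green, amber) LED states; the `Led` enum and `psu_status` function are illustrative names, not part of any SGI tool, and any combination not listed in the table is reported as undocumented rather than guessed.

```python
from enum import Enum


class Led(Enum):
    OFF = "off"
    ON = "on"
    BLINKING = "blinking"


# (green LED, amber LED) -> power supply status, transcribed from Table 2-2.
PSU_STATES = {
    (Led.OFF, Led.OFF): "No AC power to the supply",
    (Led.OFF, Led.ON): "Power supply has failed",
    (Led.OFF, Led.BLINKING): "Power supply problem warning",
    (Led.BLINKING, Led.OFF): "AC available to supply (standby) but system is off",
    (Led.ON, Led.OFF): "Power supply on (system on)",
}


def psu_status(green: Led, amber: Led) -> str:
    """Look up the power supply status for an observed LED combination."""
    return PSU_STATES.get((green, amber), "Undocumented LED combination")


print(psu_status(Led.OFF, Led.BLINKING))
```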
Individual Tray LEDs
Each server tray installed in a CloudRack C2 enclosure has LED indicators to show the operational status of the tray. The LEDs are located on the front section of the tray and are visible when the front cover of the enclosure is open; see the example in Figure 2-4. The functions of the LED status lights on the example tray shown are as follows:
• RED - Power identifier - this red LED shows that power is being supplied to the tray.
• GREEN - Power OK - this green LED lights when the correct power levels are present on the processor(s) and other components used on the tray.
• BLUE - HDD status on the tray - this blue LED lights when functionality is established on the tray’s hard disk drive or solid state disk drive.
• WHITE - this custom programmable LED is used for reporting specific tray related status that may differ depending on the build-to-order configuration you purchased.
• NUMERIC - this single-numeral LED shows the server number assigned to the tray.
Figure 2-4    Example Compute Tray Status LEDs and Switches
(Callouts: Power LED, Select server up, Current server LED, Power umbilical connector, Select server down, Power status LED, HDD LED, Power button, Custom programmable LED.)
System Fan Failure
The 42 fans that cool the main enclosure and compute trays are arranged in seven replaceable assemblies on the back of the unit. If a fan in one of the assemblies fails for any reason, a trouble indicator light comes on (see Figure 2-5). Depending on your system’s configuration, you may also receive a console warning. Refer to the SGI CloudRack C2 Component Replacement Guide for information on replacing a fan assembly. The system can continue to run with a single fan failure, but the affected fan assembly should be replaced as soon as possible.
Figure 2-5    Fan Failure LED Location Example
(Callout: Fan failure LED.)
Chapter 3. System Interfaces Overview
This chapter provides a brief overview of the standard and optional interfaces available on your
SGI CloudRack C2 system. The components of the system are described and illustrated.
NICs 1 and 2 (HPC trays only)
• NIC1 - Indicates network activity on LAN1 when flashing.
• NIC2 - Indicates network activity on LAN2 when flashing.
HDD
This light indicates channel activity for the hard disk drives; it flashes when there is disk drive activity on the unit. The drive LED can be seen through the perforated tray front panel in most system configurations.
Power
This LED indicates that power is being supplied to the system's power supply unit. It should normally be illuminated when the system is operating. Each power supply also has a trouble light that indicates a malfunction in either the supply or its cooling fan; see also “Enclosure Power Supply Status LEDs” in Chapter 2.
Tray Control Panel Button Examples
Some compute trays have push-buttons on the front panel. A power button and server selection buttons are available on specific configurations.
• Power - This is the main power button, which is used to apply or turn off the power to the compute tray. Turning off the tray with this button removes the main power but keeps standby power supplied to the tray (see Figure 3-1).
• Select Server - These buttons select the server number of the individual compute tray.
Figure 3-1    Power and Server Select Button Examples on a Compute Tray
(Callouts: Power LED, Select server up, Current server LED, Power umbilical connector, Select server down, Power status LED, HDD LED, Power button, Custom programmable LED.)
Overheat/Fan Fail LED on Back of System
When the red LED on a rear fan assembly flashes, it indicates a fan failure. When on continuously it indicates an error condition, which may be caused by a fan failure, an obstruction of the airflow in the system or the ambient room temperature being too warm. Check the routing of the cables and make sure all fans are present and operating normally.
Chapter 4. HPC Server BIOS Information
This chapter describes the functions and features of the AMI BIOS Setup Utility for the SGI HPC version of the CloudRack C2 cluster server. The AMI ROM BIOS is stored in a Flash EEPROM and can be updated as needed; check with your SGI sales or service representative for information on updates. This chapter covers basic navigation of the AMI BIOS Setup Utility screens.
Important: This BIOS information is applicable to Intel Xeon based cluster servers only.
Starting the BIOS Setup Utility
To enter the AMI BIOS Setup Utility screens, press the <Delete> key while the system is booting up.
Note: In most cases, the <Delete> key is used to launch the AMI BIOS setup screen. There are a few cases when other keys are used, such as <F1>, <F2>, etc.
Each main BIOS menu option is described in this manual. The Main BIOS setup menu screen has two main frames. The left frame displays all the options that can be configured. Note that grayed-out options cannot be configured. Options in blue can be configured by the user. The right frame displays the key legend. Above the key legend is an area reserved for a text message. When an option is selected in the left frame, it is highlighted in white. Often a text message will accompany it. Note that the AMI BIOS has default text messages built in. SGI retains the option to include, omit, or change any of these text messages.
The AMI BIOS Setup Utility uses a key-based navigation system called "hot keys". Most of the
AMI BIOS setup utility "hot keys" can be used at any time during the setup navigation process.
These keys include <F1>, <F10>, <Enter>, <ESC>, arrow keys, etc.
Note: Options printed in Bold are default settings.
How To Change the Configuration Data
The configuration data that determines the system parameters may be changed by entering the AMI BIOS Setup utility. This Setup utility can be accessed by pressing <Del> at the appropriate time during system boot.
Starting the Setup Utility
Normally, the only visible Power-On Self-Test (POST) routine is the memory test. As the memory is being tested, press the <Delete> key to enter the main menu of the AMI BIOS Setup Utility.
From the main menu, you can access the other setup screens. An AMI BIOS identification string is displayed at the left bottom corner of the screen below the copyright message.
Warning:
Do not upgrade the BIOS unless your system has a BIOS-related issue and you have instructions to do the upgrade from your SGI sales or service representative. Flashing the wrong BIOS can cause irreparable damage to the system and may void your warranty.
Your warranty may not cover direct, indirect, special, incidental, or consequential damages arising from a BIOS update. If you have to update the BIOS, do not shut down or reset the system while the BIOS is updating. This is to avoid possible boot failure.
Main Setup Screen
When you first enter the AMI BIOS Setup Utility, you will enter the Main setup screen. You can always return to the Main setup screen by selecting the Main tab on the top of the screen. The Main
BIOS Setup screen has information similar to that shown below.
System Overview:
The following BIOS information will be displayed:
System Time/System Date:
Use this option to change the system time and date. Highlight System Time or System Date using the arrow keys. Enter new values through the keyboard. Press the <Tab> key or the arrow keys to move between fields. The date must be entered in Day MM/DD/YY format. The time is entered in HH:MM:SS format. (Note that the time is in the 24-hour format. For example, 5:30 P.M.
appears as 17:30:00.)
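To illustrate the 24-hour HH:MM:SS format described above, the Python sketch below converts a 12-hour clock reading into the string you would type into the System Time field. The function name and parameters are hypothetical and exist only for this example; the BIOS itself does not perform this conversion.

```python
def to_bios_time(hour12: int, minute: int, second: int, pm: bool) -> str:
    """Convert a 12-hour clock reading to the 24-hour HH:MM:SS format."""
    if not (1 <= hour12 <= 12 and 0 <= minute < 60 and 0 <= second < 60):
        raise ValueError("time fields out of range")
    # 12 A.M. maps to hour 0 and 12 P.M. maps to hour 12.
    hour24 = hour12 % 12 + (12 if pm else 0)
    return f"{hour24:02d}:{minute:02d}:{second:02d}"


# The example from the text: 5:30 P.M.
print(to_bios_time(5, 30, 0, pm=True))
```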
BIOS Build Version:
This item displays the BIOS revision used in your system.
BIOS Build Date:
This item displays the date when this BIOS was completed.
AMI BIOS Core Version:
This item displays the revision number of the AMI BIOS Core upon which your BIOS was built.
Processor:
The AMI BIOS will automatically display the status of the processor used in your system:
CPU Type:
This item displays the type of CPU used in the system motherboard.
Speed:
This item displays the speed of the CPU detected by the BIOS.
Physical Count:
This item displays the number of processors installed in your system as detected by the BIOS.
Logical Count:
This item displays the number of CPU Cores installed in your system as detected by the BIOS.
Micro_code Revision:
This item displays the revision number of the BIOS Micro_code used in your system.
System Memory:
This displays the size of memory available in the system:
Size:
This item displays the memory size detected by the BIOS.
Advanced Setup Configurations
Use the arrow keys to select Boot Setup and hit <Enter> to access the submenu items:
BOOT Features
Quick Boot
If enabled, this option will skip certain tests during POST to reduce the time needed for system boot. The options are Enabled and Disabled.
Quiet Boot
This option allows the bootup screen options to be modified between POST messages or the OEM logo. Select Disabled to display the POST messages. Select Enabled to display the OEM logo instead of the normal POST messages. The options are Enabled and Disabled.
Add-On ROM Display Mode
This sets the display mode for Option ROM. The options are Force BIOS and Keep Current.
Bootup Num-Lock
This feature selects the Power-on state for Numlock key. The options are Off and On.
Wait For 'F1' If Error
This forces the system to wait until the 'F1' key is pressed if an error occurs. The options are
Disabled and Enabled.
Hit 'Del' Message Display
This feature displays "Press DEL to run Setup" during POST. The options are Enabled and
Disabled.
Interrupt 19 Capture
Interrupt 19 is the software interrupt that handles the boot disk function. When this item is set to
Enabled, the ROM BIOS of the host adaptors will "capture" Interrupt 19 at boot and allow the drives that are attached to these host adaptors to function as bootable disks. If this item is set to
Disabled, the ROM BIOS of the host adaptors will not capture Interrupt 19, and the drives attached to these adaptors will not function as bootable devices. The options are Enabled and Disabled.
Power Configuration
Power Button Function
If set to Instant_Off, the system will power off immediately as soon as the user hits the power button. If set to 4_Second_Override, the system will power off when the user presses the power button for 4 seconds or longer. The options are Instant_Off and 4_Second_Override.
Restore on AC Power Loss
Use this feature to set the power state after a power outage. Select Power-Off for the system power to remain off after a power loss. Select Power-On for the system power to be turned on after a power loss. Select Last State to allow the system to resume its last state before a power loss. The options are Power-On, Power-Off and Last State.
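The three Restore on AC Power Loss settings can be summarized as a small decision function. This is only a sketch of the behavior described above, written in Python; the function name and the boolean representation of the power state are assumptions made for illustration.

```python
def powers_on_after_outage(setting: str, was_on_before_loss: bool) -> bool:
    """Return True if the system should be powered on once AC power returns."""
    if setting == "Power-On":
        return True
    if setting == "Power-Off":
        return False
    if setting == "Last State":
        # Resume whatever state the system was in when power was lost.
        return was_on_before_loss
    raise ValueError(f"unknown setting: {setting!r}")


for choice in ("Power-On", "Power-Off", "Last State"):
    print(choice, powers_on_after_outage(choice, was_on_before_loss=True))
```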
Watch Dog Timer
If enabled, the Watch Dog Timer will allow the system to reboot when it is inactive for more than
5 minutes. The options are Enabled and Disabled.
Processor and Clock Options
This submenu allows the user to configure the Processor and Clock settings.
Ratio CMOS Setting
This option allows the user to set the ratio between the CPU Core Clock and the FSB Frequency.
(Note: if an invalid ratio is entered, the AMI BIOS will restore the setting to the previous state.)
The default setting depends on the type of CPU installed on the motherboard. The default setting
for the CPU installed in your motherboard is [18]. Press "+" or "-" on your keyboard to change this value.
C1E Support
Select Enabled to use the feature of Enhanced Halt State. C1E significantly reduces the CPU's power consumption by reducing the CPU's clock cycle and voltage during a "Halt State." The options are Disabled and Enabled.
Hardware Prefetcher
(Available when supported by the CPU)
If set to Enabled, the hardware pre-fetcher will pre-fetch streams of data and instructions from the main memory to the L2 cache in a forward or backward manner to improve CPU performance.
The options are Disabled and Enabled.
Adjacent Cache Line Prefetch
(Available when supported by the CPU)
If this option is set to Disabled, the CPU fetches a single 64-byte cache line. If it is set to Enabled, the CPU fetches both cache lines of the adjacent pair, 128 bytes in total. The options are Disabled and Enabled.
Intel® Virtualization Technology
(Available when supported by the CPU)
Select Enabled to use the feature of Virtualization Technology to allow one platform to run multiple operating systems and applications in independent partitions, creating multiple "virtual" systems in one physical computer. The options are Enabled and Disabled.
Note: Check with your SGI sales or support representative for information before trying to use the Virtualization option. If there is any change to this setting, you will need to power off and restart the system for the change to take effect.
Execute-Disable Bit Capability (Available when supported by the OS and the CPU)
Set to Enabled to enable the Execute Disable Bit, which allows the processor to designate areas of system memory where application code can and cannot execute. This helps prevent a worm or a virus from flooding the system with malicious code to overwhelm the processor or damage the system during an attack. The default is Enabled. (Check with your SGI sales or service representative for more information before modifying this setting.)
Simultaneous Multi-Threading (Available when supported by the CPU)
Set to Enabled to use the Simultaneous Multi-Threading Technology, which will result in increased CPU performance. The options are Disabled and Enabled.
Active Processor Cores
Use this feature to select how many cores of each processor are active. Select All to use a processor's second core and beyond. (Please refer to Intel's web site for more information.) The options are All, 1 and 2.
Intel® EIST Technology
EIST (Enhanced Intel SpeedStep Technology) allows the system to automatically adjust processor voltage and core frequency in an effort to reduce power consumption and heat dissipation. Check with your SGI sales or service representative for more information on using this option in SGI systems and clusters. The options are Disable (Disable GV3) and Enable (Enable GV3).
Intel® TurboMode Technology
Select Enabled to use the Turbo Mode to boost system performance. The options are Enabled and
Disabled.
Intel® C-STATE Tech
If enabled, C-State is set by the system automatically to either C2, C3 or C4 state. The options are
Disabled and Enabled.
C-State package limit setting
If set to Auto, the AMI BIOS will automatically set the limit on the C-State package register. The options are Auto, C1, C3, C6 and C7.
C1 Auto Demotion
When enabled, the CPU will conditionally demote C3, C6 or C7 requests to C1 based on un-core auto-demote information. The options are Disabled and Enabled.
C3 Auto Demotion
When enabled, the CPU will conditionally demote C6 or C7 requests to C3 based on un-core auto-demote information. The options are Disabled and Enabled.
Clock Spread Spectrum
Select Enabled to use Clock Spread Spectrum, which allows the BIOS to monitor and attempt to reduce the level of electromagnetic interference (EMI) caused by the components whenever needed. The options are Disabled and Enabled.
Advanced Chipset Control
The items included in the Advanced Settings submenu are listed below:
CPU Bridge Configuration
QPI Links Speed
This feature selects QPI's data transfer speed. The options are Slow-mode and Full Speed.
QPI Frequency
This selects the desired QPI frequency. The options are Auto, 4.800 GT, 5.866 GT, and 6.400 GT.
QPI L0s and L1
This enables the QPI power state to low power. L0s and L1 are automatically selected by the motherboard. The options are Disabled and Enabled.
Memory Frequency
This feature forces a DDR3 frequency slower than the one the system has detected. The available options are Auto, Force DDR-800, Force DDR-1066, and Force DDR-1333.
Memory Mode
The options are Independent, Channel Mirror, Lockstep and Sparing.
• Independent - All DIMMs are available to the operating system.
• Channel Mirror - The motherboard maintains two identical (redundant) copies of all data in memory.
• Lockstep - The motherboard uses two areas of memory to run the same set of operations in parallel.
• Sparing - A preset threshold of correctable errors is used to trigger fail-over. The spare memory is put online and used as active memory in place of the failed memory.
Demand Scrubbing
A memory error-correction scheme where the Processor writes corrected data back into the memory block from where it was read by the Processor. The options are Enabled and Disabled.
Patrol Scrubbing
A memory error-correction scheme that works in the background looking for and correcting resident errors. The options are Enabled and Disabled.
Throttling - Closed Loop/Throttling - Open Loop
Throttling improves reliability and reduces power in the processor by automatic voltage control during processor idle states. Available options are Disabled and Enabled. If Enabled, the following items will appear:
Hysteresis Temperature
(For the Closed Loop only)
Temperature Hysteresis is the temperature lag (in degrees Celsius) after the set DIMM temperature threshold is reached before Closed Loop Throttling begins. The options are Disabled, 1.5 °C, 3.0 °C, and 6.0 °C.
Guardband Temperature
(For the Closed Loop only)
This is the temperature which applies to the DIMM temperature threshold. Each step is a 0.5 °C increment. The default is [006]. Press "+" or "-" on your keyboard to change this value.
Inlet Temperature
This is the temperature detected at the chassis inlet. Each step is a 0.5 °C increment. The default is [070]. Press "+" or "-" on your keyboard to change this value.
Temperature Rise
This is the temperature rise to the DIMM thermal zone. Each step is a 0.5 °C increment. The default is [020]. Press "+" or "-" on your keyboard to change this value.
Air Flow
This is the airflow speed to the DIMM modules. Each step is one mm/sec. The default is [1500]. Press "+" or "-" on your keyboard to change this value.
Altitude
This feature defines how many meters above or below sea level the system is located. The options are Sea Level or Below, 1~300, 301~600, 601~900, 901~1200, 1201~1500, 1501~1800,
1801~2100, 2101~2400, 2401~2700, 2701~3000.
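The Altitude options form 300-meter bands, so the matching option for a given site can be worked out mechanically. The Python sketch below is illustrative only: the function name is hypothetical, and rounding fractional altitudes up to the next whole meter is an assumption, not documented BIOS behavior.

```python
import math


def altitude_option(meters: float) -> str:
    """Return the Altitude option label whose band contains the given altitude."""
    whole = math.ceil(meters)  # assumption: round fractional meters upward
    if whole <= 0:
        return "Sea Level or Below"
    if whole > 3000:
        raise ValueError("no Altitude option is listed above 3000 meters")
    band = (whole - 1) // 300  # 0 for 1~300, 1 for 301~600, and so on
    return f"{band * 300 + 1}~{(band + 1) * 300}"


print(altitude_option(1500))  # 1201~1500
```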
DIMM Pitch
This is the physical space between each DIMM module. Each step is in 1/1000 of an inch. The default is [400]. Press "+" or "-" on your keyboard to change this value.
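The throttling items above are entered as step counts rather than physical values. The following Python sketch converts the bracketed defaults into physical units, assuming each bracketed number is a count of steps of the stated size (so [006] at 0.5 °C per step would be 3.0 °C). That reading of the defaults is an assumption, and the table and function names are hypothetical.

```python
from typing import Tuple

# Steps per physical unit, taken from the step sizes stated for each item.
STEPS_PER_UNIT = {
    "Guardband Temperature": (2, "°C"),  # 0.5 °C per step
    "Inlet Temperature": (2, "°C"),      # 0.5 °C per step
    "Temperature Rise": (2, "°C"),       # 0.5 °C per step
    "Air Flow": (1, "mm/sec"),           # 1 mm/sec per step
    "DIMM Pitch": (1000, "inch"),        # 1/1000 inch per step
}


def physical_value(item: str, steps: int) -> Tuple[float, str]:
    """Convert a BIOS step count into a (value, unit) pair."""
    per_unit, unit = STEPS_PER_UNIT[item]
    return steps / per_unit, unit


defaults = {
    "Guardband Temperature": 6,
    "Inlet Temperature": 70,
    "Temperature Rise": 20,
    "Air Flow": 1500,
    "DIMM Pitch": 400,
}
for item, steps in defaults.items():
    value, unit = physical_value(item, steps)
    print(f"{item}: [{steps}] -> {value} {unit}")
```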
North Bridge Configuration
This feature allows the user to configure the settings for the Intel North Bridge chip.
Crystal Beach/DMA
This feature works with the Intel I/O AT (Acceleration Technology) to accelerate the performance of TOE devices. (Note: A TOE device is a specialized, dedicated processor that is installed on an add-on card or a network card to handle some or all packet processing of this add-on card.) When this feature is set to Enabled, it will enhance overall system performance by providing direct memory access for data transferring. The options are Enabled and Disabled. Check with your SGI sales or service representative for information on the availability of this option.
Intel VT-d
Select Enabled to enable Intel's Virtualization Technology support for Direct I/O VT-d by reporting the I/O device assignments to VMM through the DMAR ACPI Tables. This feature offers fully-protected I/O resource-sharing across the Intel platforms, providing the user with greater reliability, security and availability in networking and data-sharing. The settings are
Enabled and Disabled.
IOH PCIE Port1 Bifurcation
This feature allows the user to set the IOH bifurcation configuration for PCI-E Port 1. The options are X4X4X4X4, X4X4X8, X8X4X4, and X8X8.
IOH PCIE Max Payload Size
Some add-on cards perform faster with the coalesce feature, which limits the payload size to 128 bytes, while others perform faster with a payload size of 256 bytes, which inhibits the coalesce feature. Please refer to your add-on card user guide for the desired setting. The options are 256 bytes and 128 bytes.
SouthBridge Configuration
This feature allows the user to configure the settings for the Intel ICH South Bridge chipset.
USB Functions
This feature allows the user to decide the number of on-board USB ports to be enabled. The options are Disabled, 2 USB ports, 4 USB ports, 6 USB ports, 8 USB ports, 10 USB ports, and 12 USB ports.
Legacy USB Support
Select Enabled to use Legacy USB devices. If this item is set to Auto, Legacy USB support will be automatically enabled if a legacy USB device is installed on the motherboard, and vice versa. The settings are Disabled and Enabled.
USB 2.0 Controller
Select Enabled to activate the on-board USB 2.0 controller. The options are Enabled and Disabled.
USB 2.0 Controller Mode
This setting allows you to select the USB 2.0 Controller mode. The options are Hi-Speed (480
Mbps) and Full Speed (12 Mbps).
BIOS EHCI Hand-Off
Select Enabled to enable BIOS Enhanced Host Controller Interface (EHCI) hand-off support, which provides a workaround for an operating system that does not have EHCI Hand-Off support. When enabled, control of the EHCI interface changes from BIOS-controlled to OS-controlled. The options are Disabled and Enabled.
IDE/SATA Configuration
When this submenu is selected, the AMI BIOS automatically detects the presence of the IDE devices and displays the following items:
• SATA#1 Configuration
If Compatible is selected, it sets SATA#1 to legacy compatibility mode, while selecting Enhanced sets SATA#1 to native SATA mode. The options are Disabled, Compatible and Enhanced.
• Configure SATA#1 as
This feature allows the user to select the drive type for SATA#1. The options are IDE, RAID and
AHCI.
• SATA#2 Configuration
Selecting Enhanced will set SATA#2 to native SATA mode. The options are Disabled and Enhanced.
Primary IDE Master/Slave, Secondary IDE Master/Slave, Third IDE Master, and Fourth IDE Master
These settings allow the user to set the parameters of the Primary IDE Master/Slave, Secondary IDE Master/Slave, Third IDE Master, and Fourth IDE Master slots. Hit <Enter> to activate the following submenu screen for detailed options of these items. Set the correct configurations accordingly. The items included in the submenu are:
• Type
Select the type of device connected to the system. The options are Not Installed, Auto, CD/DVD and ARMD.
• LBA/Large Mode
LBA (Logical Block Addressing) is a method of addressing data on a disk drive. In the LBA mode, the maximum drive capacity is 137 GB. For drive capacities over 137 GB, your system must support 48-bit LBA mode addressing. If it does not, contact your manufacturer or install an ATA/133 IDE controller card that supports 48-bit LBA mode. The options are Disabled and Auto.
• Block (Multi-Sector Transfer)
Block Mode boosts the IDE drive performance by increasing the amount of data transferred. Only 512 bytes of data can be transferred per interrupt if Block Mode is not used. Block Mode allows transfers of up to 64 KB per interrupt. Select Disabled to allow data to be transferred from and to the device one sector at a time. Select Auto to allow data to be transferred from and to the device multiple sectors at a time if the device supports it. The options are Auto and Disabled.
• PIO Mode
The IDE PIO (Programmable I/O) Mode programs timing cycles between the IDE drive and the programmable IDE controller. As the PIO mode increases, the cycle time decreases. The options are Auto, 0, 1, 2, 3, and 4.
Select Auto to allow the AMI BIOS to automatically detect the PIO mode. Use this value if the IDE disk drive support cannot be determined.
Select 0 to allow the AMI BIOS to use PIO mode 0. It has a data transfer rate of 3.3 MB/s.
Select 1 to allow the AMI BIOS to use PIO mode 1. It has a data transfer rate of 5.2 MB/s.
Select 2 to allow the AMI BIOS to use PIO mode 2. It has a data transfer rate of 8.3 MB/s.
Select 3 to allow the AMI BIOS to use PIO mode 3. It has a data transfer rate of 11.1 MB/s.
Select 4 to allow the AMI BIOS to use PIO mode 4. It has a data transfer rate of 16.6 MB/s. 32-bit data transfer is enabled separately through the 32Bit Data Transfer item.
• DMA Mode
Select Auto to allow the BIOS to automatically detect IDE DMA mode when the IDE disk drive support cannot be determined.
Select SWDMA0 to allow the BIOS to use Single Word DMA mode 0. It has a data transfer rate of 2.1 MB/s.
Select SWDMA1 to allow the BIOS to use Single Word DMA mode 1. It has a data transfer rate of 4.2 MB/s.
Select SWDMA2 to allow the BIOS to use Single Word DMA mode 2. It has a data transfer rate of 8.3 MB/s.
Select MWDMA0 to allow the BIOS to use Multi Word DMA mode 0. It has a data transfer rate of 4.2 MB/s.
Select MWDMA1 to allow the BIOS to use Multi Word DMA mode 1. It has a data transfer rate of 13.3 MB/s.
Select MWDMA2 to allow the BIOS to use Multi Word DMA mode 2. It has a data transfer rate of 16.6 MB/s.
Select UDMA0 to allow the BIOS to use Ultra DMA mode 0. It has a data transfer rate of 16.6 MB/s. It has the same transfer rate as PIO mode 4 and Multi Word DMA mode 2.
Select UDMA1 to allow the BIOS to use Ultra DMA mode 1. It has a data transfer rate of 25 MB/s.
Select UDMA2 to allow the BIOS to use Ultra DMA mode 2. It has a data transfer rate of 33.3 MB/s.
Select UDMA3 to allow the BIOS to use Ultra DMA mode 3. It has a data transfer rate of 44.4 MB/s.
Select UDMA4 to allow the BIOS to use Ultra DMA mode 4. It has a data transfer rate of 66.6 MB/s.
The options are Auto, SWDMAn, MWDMAn, and UDMAn.
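The capacity and block-size figures quoted under LBA/Large Mode and Block (Multi-Sector Transfer) follow from simple arithmetic on 512-byte sectors. The Python sketch below assumes the 137 GB limit corresponds to 28-bit sector addressing, which is the usual explanation for that limit but is not stated in this guide. The function names are hypothetical.

```python
SECTOR_BYTES = 512  # sector size used in the Block Mode description


def lba_capacity_bytes(address_bits: int) -> int:
    """Largest drive size addressable with the given number of LBA bits."""
    return (2 ** address_bits) * SECTOR_BYTES


def sectors_per_interrupt(block_bytes: int) -> int:
    """Number of whole sectors moved per interrupt for a given block size."""
    return block_bytes // SECTOR_BYTES


# Assumption: 28-bit addressing is the source of the 137 GB limit.
print(lba_capacity_bytes(28))             # 137438953472 bytes, about 137 GB
print(lba_capacity_bytes(48) // 10**12)   # capacity with 48-bit LBA, in TB
print(sectors_per_interrupt(64 * 1024))   # 128 sectors in a 64 KB block
```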
S.M.A.R.T. For Hard disk drives
Self-Monitoring Analysis and Reporting Technology (S.M.A.R.T.) can help predict impending drive failures. Select Auto to allow the AMI BIOS to automatically detect hard disk drive support. Select Disabled to prevent the AMI BIOS from using S.M.A.R.T. Select Enabled to allow the AMI BIOS to use S.M.A.R.T. to support hard disk drives. The options are Disabled, Enabled, and Auto.
32Bit Data Transfer
Select Enabled to enable 32-bit IDE data transfer. The options are Enabled and Disabled.
IDE Detect Timeout (sec)
Use this feature to set the time-out value for the BIOS to detect the ATA and ATAPI devices installed in the system. The options are 0 (sec), 5, 10, 15, 20, 25, 30, and 35.
Clear NVRAM
This feature clears the NVRAM during system boot. The options are No and Yes.
Plug & Play OS
Selecting Yes allows the OS to configure Plug & Play devices. (This is not required for system boot if your system has an OS that supports Plug & Play.) Select No to allow the AMI BIOS to configure all devices in the system.
PCI Latency Timer
This feature sets the latency Timer of each PCI device installed on a PCI bus. Select 64 to set the
PCI latency to 64 PCI clock cycles. The options are 32, 64, 96, 128, 160, 192, 224 and 248.
PCI IDE BusMaster
When enabled, the BIOS uses PCI bus mastering for reading/writing to IDE drives. The options are Disabled and Enabled.
Load Onboard LAN1 Option ROM/Load Onboard LAN2 Option ROM
Select Enabled to enable the onboard LAN1 or LAN2 Option ROM. This is to boot the computer using a network interface. The options are Enabled and Disabled.
Serial Port1 Address/ Serial Port2 Address
This option specifies the base I/O port address and the Interrupt Request address of Serial Port 1 and Serial Port 2. Select Disabled to prevent the serial port from accessing any system resources.
When this option is set to Disabled, the serial port physically becomes unavailable. Select
3F8/IRQ4 to allow the serial port to use 3F8 as its I/O port address and IRQ 4 for the interrupt address. The options for Serial Port1 are Disabled, 3F8/IRQ4, 3E8/IRQ4, 2E8/IRQ3. The options for Serial Port2 are Disabled, 2F8/IRQ3, 3E8/IRQ4, and 2E8/IRQ3.
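Each serial port option packs a hexadecimal base I/O address and an IRQ number into one label. As a minimal sketch, the Python function below splits such a label into its two parts; the function name and the choice to return None for Disabled are assumptions made for illustration.

```python
from typing import Optional, Tuple


def parse_serial_option(option: str) -> Optional[Tuple[int, int]]:
    """Split a label such as '3F8/IRQ4' into (base I/O address, IRQ number)."""
    text = option.strip().upper()
    if text == "DISABLED":
        return None  # the port uses no system resources
    base_text, irq_text = text.split("/IRQ")
    return int(base_text, 16), int(irq_text)


for label in ("Disabled", "3F8/IRQ4", "2F8/IRQ3", "3E8/IRQ4", "2E8/IRQ3"):
    print(label, parse_serial_option(label))
```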
Remote Access Configuration
Remote Access
This allows the user to enable the Remote Access feature. The options are Disabled and Enabled.
If Remote Access is set to Enabled, the following items will display:
• Serial Port Number
This feature allows the user to decide which serial port is used for Console Redirection. The options are COM 1 and COM 2.
• Serial Port Mode
This feature allows the user to set the serial port mode for Console Redirection. The options are 115200 8, n, 1; 57600 8, n, 1; 38400 8, n, 1; 19200 8, n, 1; and 9600 8, n, 1.
• Flow Control
This feature allows the user to set the flow control for Console Redirection. The options are
None, Hardware, and Software.
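The Serial Port Mode labels above encode baud rate, data bits, parity, and stop bits. The Python sketch below parses a label such as "57600 8, n, 1" and estimates character throughput, assuming standard asynchronous framing with one start bit per character. The class and function names are hypothetical, and the terminal at the other end of the console must be set to the same values.

```python
from typing import NamedTuple


class SerialMode(NamedTuple):
    baud: int
    data_bits: int
    parity: str
    stop_bits: int


def parse_serial_mode(label: str) -> SerialMode:
    """Parse a label such as '57600 8, n, 1' into its four fields."""
    baud_text, rest = label.strip().split(None, 1)
    data_bits, parity, stop_bits = [part.strip() for part in rest.split(",")]
    return SerialMode(int(baud_text), int(data_bits), parity.upper(), int(stop_bits))


def chars_per_second(mode: SerialMode) -> int:
    """Approximate throughput, assuming one start bit per character."""
    parity_bits = 0 if mode.parity == "N" else 1
    frame_bits = 1 + mode.data_bits + parity_bits + mode.stop_bits
    return mode.baud // frame_bits


mode = parse_serial_mode("57600 8, n, 1")
print(mode, chars_per_second(mode))
```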
Redirection After BIOS POST
Select Disabled to turn off Console Redirection after Power-On Self-Test (POST). Select Always to keep Console Redirection active all the time after POST.
Note: This setting may not be supported by some operating systems.
Select Boot Loader to keep Console Redirection active during POST and Boot Loader. The options are Disabled, Boot Loader, and Always.
Terminal Type
This feature allows the user to select the target terminal type for Console Redirection. The options are ANSI, VT100, and VT-UTF8.
VT-UTF8 Combo Key Support
A terminal keyboard definition that provides a way to send commands from a remote console.
Available options are Enabled and Disabled.
Sredir Memory Display Delay
This feature defines the length of time in seconds to display memory information. The options are No Delay, Delay 1 Sec, Delay 2 Sec, and Delay 4 Sec.
Hardware Health Monitor
This feature allows the user to monitor system health and review the status of each item as displayed.
CPU Overheat Alarm
This option allows the user to select the CPU Overheat Alarm setting which determines when the
CPU OH alarm will be activated to provide warning of possible CPU overheat.
Warning:
Any temperature that exceeds the CPU threshold temperature predefined by the CPU manufacturer may result in CPU overheat or system instability. When the CPU temperature reaches this predefined threshold, the CPU and system cooling fans will run at full speed.
The options are:
• The Early Alarm: Select this setting if you want the CPU overheat alarm (including the LED and the buzzer) to be triggered as soon as the CPU temperature reaches the CPU overheat threshold as predefined by the CPU manufacturer.
• The Default Alarm: Select this setting if you want the CPU overheat alarm (including the LED and the buzzer) to be triggered when the CPU temperature reaches about 5 °C above the threshold temperature as predefined by the CPU manufacturer to give the CPU and system fans additional time needed for CPU and system cooling. In both the alarms above, please take immediate action as shown below. (See the notes on P. 4-18 for more information.)
CPU Temperature/System Temperature
This feature displays current temperature readings for the CPU and the System.
The following items will be displayed for your reference only:
CPU Temperature
The CPU thermal technology that reports absolute temperatures (Celsius/Fahrenheit) has been upgraded to a more advanced feature by Intel in its newer processors. The basic concept is that each CPU is embedded with unique temperature information that the motherboard can read. This ‘Temperature Threshold’ or ‘Temperature Tolerance’ has been assigned at the factory and is the baseline on which the motherboard takes action during different CPU temperature conditions (e.g., by increasing CPU fan speed or triggering the Overheat Alarm). Since CPUs can have different ‘Temperature Tolerances’, the installed CPU can now send information to the motherboard regarding what its ‘Temperature Tolerance’ is, and not the other way around. This results in better CPU thermal management.
The manufacturer has leveraged this feature by assigning a temperature status to certain thermal conditions in the processor (Low, Medium and High). This makes it easier for the user to understand the CPU’s temperature status than simply seeing a temperature reading (e.g., 25 °C). The CPU Temperature feature will display the CPU temperature status as detected by the BIOS:
Low – This level is considered as the ‘normal’ operating state. The CPU temperature is well below the CPU ‘Temperature Tolerance’. The motherboard fans and CPU will run normally as configured in the BIOS (Fan Speed Control).
User intervention: No action required.
Medium – The processor is running warmer. This is a ‘precautionary’ level and generally means that there may be factors contributing to this condition, but the CPU is still within its normal operating state and below the CPU ‘Temperature Tolerance’. The motherboard fans and CPU will run normally as configured in the BIOS. The fans may adjust to a faster speed depending on the
Fan Speed Control settings.
User intervention: No action is required. However, consider checking the CPU fans and the chassis ventilation for blockage.
High – The processor is running hot. This is a ‘caution’ level since the CPU’s ‘Temperature
Tolerance’ has been reached (or has been exceeded) and may activate an overheat alarm.
User intervention: If the system buzzer and Overheat LED have activated, take action immediately by checking the system fans, chassis ventilation and room temperature to correct any problems.
Note: The system may shut down if it continues for a long period to prevent damage to the CPU.
The information provided above is for your reference only. For more information on processor thermal management, reference Intel’s Web site at www.Intel.com or contact your support representative.
System Temperature: The system temperature will be displayed (in degrees Celsius and Fahrenheit) as it is detected by the BIOS.
Fan Speed Control Monitor
This feature allows the user to decide how the system controls the speeds of the on-board fans. The
CPU temperature and the fan speed are correlative. When the CPU on-die temperature increases, the fan speed will also increase, and vice versa. Select Workstation if your system is used as a
Workstation. Select Server if your system is used as a Server. Select “Disabled, (Full Speed
@12V)” to disable the fan speed control function and allow the on-board fans to constantly run at the full speed (12V). The Options are: 1. Disabled (Full Speed), 2. Server Mode, 3. Workstation
Mode.
Fan1 ~ Fan 4 Reading
This feature displays the fan speed readings from fan interfaces Fan1 through Fan5.
CPU1 Vcore, CPU2 Vcore, +5Vin, +12Vcc (V), VPI DIMM, VP2 DIMM, 3.3Vcc (V), and
Battery Voltage
ACPI Configuration
Use this feature to configure Advanced Configuration and Power Interface (ACPI) power management settings for your system.
ACPI Version Features
The options are ACPI v1.0, ACPI v2.0 and ACPI v3.0. Please refer to ACPI's website for further explanation: http://www.acpi.info/.
ACPI APIC Support
Select Enabled to include the ACPI APIC Table Pointer in the RSDT pointer list. The options are Enabled and Disabled.
APIC ACPI SCI IRQ
When this item is set to Enabled, APIC ACPI SCI IRQ is supported by the system. The options are Enabled and Disabled.
USB Device Wakeup from S3/S4
Select Enabled to allow USB devices to wake the system from the S3/S4 state. The options are Enabled and Disabled.
High Performance Event Timer
Select Enabled to activate the High Performance Event Timer (HPET), which produces periodic interrupts at a much higher frequency than a Real-time Clock (RTC) does in synchronizing multimedia streams, providing smooth playback and reducing the dependency on other timestamp calculation devices, such as an x86 RDTSC instruction embedded in the CPU. The High Performance Event Timer is used to replace the 8254 Programmable Interval Timer. The options are Enabled and Disabled.
IPMI Configuration
Intelligent Platform Management Interface (IPMI) is a set of common interfaces that IT administrators can use to monitor system health and to manage the system as a whole. For more information on the IPMI specifications, please visit Intel's website at www.intel.com.
Status of BMC
Baseboard Management Controller (BMC) manages the interface between system management software and platform hardware. This is an informational feature which returns the status code of the BMC micro controller.
View BMC System Event Log
This feature displays the BMC System Event Log (SEL). It shows the total number of entries of
BMC System Events. To view an event, select an Entry Number and press <Enter> to display the information as shown in the example below:
• Total Number of Entries
• SEL Entry Number
• SEL Record ID
• SEL Record Type
• Timestamp
• Generator ID
• Event Message Format User
• Event Sensor Type
• Event Sensor Number
• Event Dir Type
• Event Data
Clear BMC System Event Log
This feature is used to clear the BMC System Event Log. Caution: Any cleared information is unrecoverable. Make absolutely sure that you no longer need any data stored in the log before clearing the BMC Event Log.
Set LAN Configuration
Set this feature to configure the IPMI LAN adapter with a network address.
Channel Number - Enter the channel number for the SET LAN Configuration command. This is initially set to [1]. Press "+" or "-" on your keyboard to change the Channel Number.
Channel Number Status - This feature returns the channel status for the Channel Number selected above: "Channel Number is OK" or "Wrong Channel Number".
IP Address Configuration
Enter the IP address for this machine. This should be in decimal and in dotted quad form (e.g., 192.168.10.253). The value of each three-digit number separated by dots should not exceed 255.
Parameter Selector
Use this feature to select the parameter of your IP Address configuration.
IP Address
The BIOS will automatically enter the IP address of this machine; however, it may be overridden. IP addresses are four decimal numbers (0 ~ 255) separated by dots (e.g., 192.168.10.253).
Current IP Address in BMC
This item displays the current IP address used for your IPMI connection.
MAC Address Configuration
Enter the MAC address for this machine. This should be six two-digit hexadecimal numbers separated by dots (e.g., 00.30.48.D0.D4.60).
Parameter Selector
Use this feature to select the parameter of your MAC Address configuration.
MAC Address
The BIOS will automatically enter the MAC address of this machine; however, it may be overridden. MAC addresses are 6 two-digit hexadecimal numbers (Base 16, 0 ~ 9, A, B, C, D, E, F) separated by dots (e.g., 00.30.48.D0.D4.60).
Current MAC Address in BMC
This item displays the current MAC address used for your IPMI connection.
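The two address formats used in this section, a dotted quad of decimal values no greater than 255 (such as 192.168.10.253) and six two-digit hexadecimal numbers separated by dots (such as 00.30.48.D0.D4.60), can be checked before they are entered. The Python sketch below is one way to do that; the function names are hypothetical and the BIOS itself may validate differently.

```python
import re

# Six two-digit hexadecimal groups separated by dots, as in 00.30.48.D0.D4.60.
_DOTTED_MAC = re.compile(r"[0-9A-Fa-f]{2}(?:\.[0-9A-Fa-f]{2}){5}")


def is_dotted_quad(text: str) -> bool:
    """True if text is four decimal fields, each at most three digits and <= 255."""
    parts = text.split(".")
    if len(parts) != 4:
        return False
    return all(
        part.isascii() and part.isdigit() and len(part) <= 3 and int(part) <= 255
        for part in parts
    )


def is_dotted_mac(text: str) -> bool:
    """True if text is six two-digit hexadecimal numbers separated by dots."""
    return _DOTTED_MAC.fullmatch(text) is not None


print(is_dotted_quad("192.168.10.253"), is_dotted_mac("00.30.48.D0.D4.60"))
```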
Subnet Mask Configuration
Subnet masks tell the network which subnet this machine belongs to. The value of each three-digit number separated by dots should not exceed 255.
Parameter Selector
Use this feature to select the parameter of your Subnet Masks configuration.
Subnet Masks
This item displays the current subnet masks setting for your IPMI connection.
SEL PEF Configuration
Set PEF Configuration
Set this feature to configure the Platform Event Filter (PEF). PEF interprets BMC events and performs actions based on pre-determined settings or 'traps' under IPMI 1.5 specifications, for example, powering the system down or sending an alert when a triggering event is detected.
The following will appear if PEF Support is set to Enabled. The default is Disabled.
PEF Action Global Control - These are the different actions based on BMC events. The options are Alert, Power Down, Reset System, Power Cycle, OEM Action, and Diagnostic Interface.
Alert Startup Delay - This feature inserts a delay during startup for PEF alerts. The options are Enabled and Disabled.
PEF Alert Startup Delay - This sets the pre-determined time to delay PEF alerts after system power-ups and resets. Refer to Table 24.6 of the IPMI 1.5 Specification for more information at www.intel.com. The options are No Delay, 30 sec, 60 sec, 1.5 min, and 2.0 min.
Startup Delay - This feature enables or disables startup delay. The options are Enabled and Disabled.
PEF Startup Delay - This sets the pre-determined time to delay PEF after system power-ups and resets. Refer to Table 24.6 of the IPMI 1.5 Specification for more information at www.intel.com. The options are No Delay, 30 sec, 60 sec, 1.5 min, and 2.0 min.
Event Message for PEF Action - This enables or disables Event Messages for PEF action. Refer to Table 24.6 of the IPMI 1.5 Specification at www.intel.com for more information. The options are Disabled and Enabled.
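The PEF startup delay options listed above mix seconds and fractional minutes. As a convenience when recording settings, they can be normalized to seconds. This is a minimal sketch; the constant and function names are hypothetical, and only the option labels come from this guide.

```python
# Option labels as listed for PEF Alert Startup Delay and PEF Startup Delay.
PEF_DELAY_OPTIONS = ("No Delay", "30 sec", "60 sec", "1.5 min", "2.0 min")


def delay_seconds(option: str) -> int:
    """Convert one of the listed delay options to a number of seconds."""
    if option not in PEF_DELAY_OPTIONS:
        raise ValueError(f"unknown PEF delay option: {option!r}")
    if option == "No Delay":
        return 0
    amount, unit = option.split()
    factor = 60 if unit == "min" else 1
    return int(float(amount) * factor)
```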
BMC Watch Dog Timer Action
Allows the BMC to reset or power down the system if the operating system hangs or crashes. The options are Disabled, Reset System, Power Down, Power Cycle.
BMC Watch Dog TimeOut [Min:Sec]
This option appears if BMC Watch Dog Timer Action (above) is enabled. This is a timed delay, in minutes or seconds, before the system is powered down or reset after an operating system failure is detected. The options are [5 Min], [1 Min], [30 Sec], and [10 Sec].
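The dependency between the two watchdog settings (a timeout applies only when an action other than Disabled is selected) can be modeled as follows. This is a sketch for record-keeping or configuration checks; the class and constant names are hypothetical, and only the option labels come from this guide.

```python
from dataclasses import dataclass
from typing import Optional

# Option labels as listed for the two BMC Watch Dog settings.
ACTIONS = ("Disabled", "Reset System", "Power Down", "Power Cycle")
TIMEOUTS_SECONDS = {"5 Min": 300, "1 Min": 60, "30 Sec": 30, "10 Sec": 10}


@dataclass(frozen=True)
class WatchdogSetting:
    action: str
    timeout: Optional[str] = None

    def __post_init__(self) -> None:
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")
        if self.action == "Disabled":
            # The timeout option only appears when an action is enabled.
            if self.timeout is not None:
                raise ValueError("timeout applies only when an action is enabled")
        elif self.timeout not in TIMEOUTS_SECONDS:
            raise ValueError(f"unknown timeout: {self.timeout!r}")

    def timeout_seconds(self) -> Optional[int]:
        """Return the timeout in seconds, or None when the watchdog is disabled."""
        return None if self.timeout is None else TIMEOUTS_SECONDS[self.timeout]
```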
The DMI Event Log
View Event Log
Use this option to view the System Event Log.
Mark all events as read
This option marks all events as read. The options are OK and Cancel.
Clear event log
This option clears the Event Log memory of all messages. The options are OK and Cancel.
Security Settings
The AMI BIOS provides a Supervisor and a User password. If you use both passwords, the
Supervisor password must be set first.
Supervisor Password
This item indicates if a supervisor password has been entered for the system. Clear means such a password has not been used and Set means a supervisor password has been entered for the system.
User Password:
This item indicates if a user password has been entered for the system. Clear means such a password has not been used and Set means a user password has been entered for the system.
Change Supervisor Password
Select this feature and press <Enter> to access the submenu, and then type in a new Supervisor
Password.
User Access Level
(Available when Supervisor Password is set as above)
Available options are:
• Full Access: grants full User read and write access to the Setup Utility.
• View Only: allows access to the Setup Utility, but the fields cannot be changed.
• Limited: allows only limited fields to be changed, such as Date and Time.
• No Access: prevents User access to the Setup Utility.
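The four User Access Level options described above can be summarized as a small permission model. This is an illustrative sketch only: the names are hypothetical, and the set of fields editable under Limited is an assumption, since the text names Date and Time only as examples.

```python
from enum import Enum


class AccessLevel(Enum):
    FULL_ACCESS = "Full Access"
    VIEW_ONLY = "View Only"
    LIMITED = "Limited"
    NO_ACCESS = "No Access"


# Assumption: the guide gives Date and Time as examples, not a complete list.
LIMITED_FIELDS = frozenset({"Date", "Time"})


def can_enter_setup(level: AccessLevel) -> bool:
    """A User can open the Setup Utility unless the level is No Access."""
    return level is not AccessLevel.NO_ACCESS


def can_change(level: AccessLevel, field: str) -> bool:
    """Return True if a User at this level may change the named field."""
    if level is AccessLevel.FULL_ACCESS:
        return True
    if level is AccessLevel.LIMITED:
        return field in LIMITED_FIELDS
    return False
```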
Change User Password
Select this feature and press <Enter> to access the submenu, and then type in a new User
Password.
Clear User Password
(Available only if User Password has been set)
This item allows you to clear a user password after it has been entered.
Password Check
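This item determines when the password is checked. The options are Setup (the password is required to enter the Setup Utility) and Always (the password is required to enter the Setup Utility and at each boot).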
Boot Sector Virus Protection
When Enabled, the AMI BIOS displays a warning when any program (or virus) issues a Disk
Format command or attempts to write to the boot sector of the hard disk drive. The options are
Enabled and Disabled.
Boot Configuration
Use this feature to configure boot settings.
Boot Device Priority
This feature allows the user to specify the priority sequence of the boot devices. The settings are 1st boot device, 2nd boot device, 3rd boot device, 4th boot device, 5th boot device and Disabled.
• 1st Boot Device - [USB: XXXXXXXXX]
• 2nd Boot Device - [CD/DVD: XXXXXXXXX]
Hard Disk Drives
This feature allows the user to specify the boot sequence from all available hard disk drives. The settings are Disabled and a list of all hard disk drives that have been detected (e.g., 1st Drive, 2nd Drive, 3rd Drive, etc.).
• 1st Drive - [SATA: XXXXXXXXX]
Removable Drives
This feature allows the user to specify the boot sequence from available Removable Drives. The settings are 1st boot device, 2nd boot device, and Disabled.
• 1st Drive - [USB: XXXXXXXXX]
• 2nd Drive
CD/DVD Drives
This feature allows the user to specify the boot sequence from available CD/DVD Drives (e.g., 1st Drive, 2nd Drive, etc.).
Exit Options
Select the Exit tab from the AMI BIOS Setup Utility screen to enter the Exit BIOS Setup screen.
Save Changes and Exit
When you have completed the system configuration changes, select this option to leave the BIOS
Setup Utility and reboot the computer, so the new system configuration parameters can take effect.
Select Save Changes and Exit from the Exit menu and press <Enter>.
Discard Changes and Exit
Select this option to quit the BIOS Setup without making any permanent changes to the system configuration, and reboot the computer. Select Discard Changes and Exit from the Exit menu and press <Enter>.
Discard Changes
Select this option and press <Enter> to discard all the changes and return to the AMI BIOS Utility
Program.
Load Optimal Defaults
To set this feature, select Load Optimal Defaults from the Exit menu and press <Enter>. Then, select OK to allow the AMI BIOS to automatically load Optimal Defaults to the BIOS Settings.
The Optimal settings are designed for maximum system performance, but may not work best for all computer applications.
Load Fail-Safe Defaults
To set this feature, select Load Fail-Safe Defaults from the Exit menu and press <Enter>. The
Fail-Safe settings are designed for maximum system stability, but not for maximum performance.
BIOS Error Beep Codes
During the POST (Power-On Self-Test) routines, which are performed each time the system is powered on, errors may occur.
Non-fatal errors are those which, in most cases, allow the system to continue the boot-up process.
The error messages normally appear on the screen.
Fatal errors are those which will not allow the system to continue the boot-up procedure. If a fatal error occurs, you should consult with your system manufacturer for possible repairs.
BIOS Error Beep Code List
The following list of error codes may be helpful in diagnosing certain system problems.
BIOS Error Beep Codes
Beep Code | Error Message | Description
1 beep | Refresh | Circuits have been reset. (Ready to power up)
5 short beeps + 1 long beep | Memory error | No memory detected in the system
8 beeps | Display memory read/write error | Video adapter missing or with faulty memory
1 continuous beep (with the front panel OH LED on) | System Overheat | 1 continuous beep with the front panel OH LED on
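For scripted troubleshooting notes, the beep codes listed above can be kept in a simple lookup table. This is a minimal sketch; the dictionary and function names are hypothetical, and the entries are copied from the list in this section.

```python
# Beep pattern -> (error message, description), as listed in this section.
BEEP_CODES = {
    "1 beep": ("Refresh", "Circuits have been reset. (Ready to power up)"),
    "5 short beeps + 1 long beep": ("Memory error", "No memory detected in the system"),
    "8 beeps": (
        "Display memory read/write error",
        "Video adapter missing or with faulty memory",
    ),
    "1 continuous beep (with the front panel OH LED on)": (
        "System Overheat",
        "1 continuous beep with the front panel OH LED on",
    ),
}


def describe_beep_code(pattern: str) -> str:
    """Return a one-line summary for a listed beep pattern."""
    try:
        message, description = BEEP_CODES[pattern]
    except KeyError:
        raise ValueError(f"beep pattern not in the documented list: {pattern!r}") from None
    return f"{message}: {description}"
```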
Appendix A. Technical Specifications
This appendix contains technical specification information about your system.
Server Specifications and Features
Table A-1 shows the physical specifications of the SGI CloudRack C2 server system.
Table A-1 SGI CloudRack C2 Enclosure Physical Specifications
System Features | Specification
Height | 78.7 in. (42U), 47.2 in. (24U)
Width | 24 in.
Depth | 46 in.
Weight (full) maximum | Approximately 2135 lbs. (42U), 1375 lbs. (24U). Shipping weight will be higher.
Voltage range | 200-240 VAC (180-264 VAC tolerance range)
Cycles per second | 50 or 60 Hz (single-phase AC)
System Cooling | 42 system fans (up to 12 dedicated power supply fans)
Phase required | Single-phase or three-phase
Power supply | 12 2900-watt power supplies
Hard drive bays | Two per (HPC) compute tray. Up to six with other tray configurations.
Power cable | Up to six pluggable cords
PCIe slots | Optional low-profile (x16) PCI-Express slots on HPC compute trays
Environmental Specifications
Table A-2 lists the environmental specifications of the system.
Table A-2 Environmental Specifications
Feature | Specification
Temperature tolerance (operating) | +5 °C (41 °F) to +35 °C (95 °F) (up to 1500 m / 5000 ft.); +5 °C (41 °F) to +30 °C (86 °F) (1500 m to 3000 m / 5000 ft. to 10,000 ft.)
Temperature tolerance (non-operating) | -40 °C (-40 °F) to +60 °C (140 °F)
Relative humidity | 10% to 80% operating (no condensation); 8% to 95% non-operating (no condensation)
Cooling requirement | Ambient air or optional water cooling
Maximum altitude | 10,000 ft. (3,049 m) operating; 40,000 ft. (12,195 m) non-operating
Acoustical noise level | Less than 65 dBa maximum
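The operating limits in Table A-2 depend on altitude, so a site check needs both values. The following sketch encodes the operating temperature and humidity rows. The function names are hypothetical, the 3000 m ceiling is taken from the temperature row, and altitudes below sea level are treated the same as "up to 1500 m", which is an assumption.

```python
def max_operating_temp_c(altitude_m: float) -> float:
    """Upper operating temperature limit (in Celsius) for a given altitude in meters."""
    if altitude_m <= 1500:
        return 35.0
    if altitude_m <= 3000:
        return 30.0
    raise ValueError("altitude is above the 3000 m operating range in Table A-2")


def within_operating_range(temp_c: float, altitude_m: float, humidity_pct: float) -> bool:
    """Check temperature and relative humidity against the operating rows of Table A-2."""
    temp_ok = 5.0 <= temp_c <= max_operating_temp_c(altitude_m)
    humidity_ok = 10.0 <= humidity_pct <= 80.0
    return temp_ok and humidity_ok


def celsius_to_fahrenheit(temp_c: float) -> float:
    """Standard conversion, useful for cross-checking the paired values in the table."""
    return temp_c * 9 / 5 + 32
```

For example, a 32 °C room passes at 1000 m but fails at 2000 m, where the upper limit drops to 30 °C.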