CS-Storm™ 500NX Hardware Guide

H-6153
Contents
About the CS-Storm 500NX Hardware Guide...........................................................................................................3
CS-Storm 500NX Server...........................................................................................................................................4
Drive Configuration and Status........................................................................................................................6
Controls and Connectors.................................................................................................................................7
Fan Module and Power Supply Numbering...................................................................................................10
GPU Tray.................................................................................................................................................................11
GPU SXM Board...........................................................................................................................................14
Midplane..................................................................................................................................................................17
Motherboard Tray....................................................................................................................................................18
Drive Backplane............................................................................................................................................19
Cabling..........................................................................................................................................................20
Motherboard..................................................................................................................................................22
Component Locations.........................................................................................................................23
Architecture and Features...................................................................................................................26
Memory Support and Population.........................................................................................................28
Rack Power and Server Configuration....................................................................................................................31
Replacement Procedures........................................................................................................................................37
Server Lift Setup............................................................................................................................................37
Tray Removal and Installation.......................................................................................................................39
Tray Cover Removal......................................................................................................................................40
GPU Module and Heatsink Replacement......................................................................................................42
Motherboard Processor and Heatsink Replacement.....................................................................................47
DIMM Replacement.......................................................................................................................................53
Drive Removal...............................................................................................................................................55
Drive Installation..................................................................................................................................56
Fan and Power Supply Replacement............................................................................................................57
About the CS-Storm 500NX Hardware Guide
The Cray® CS-Storm 500NX™ Hardware Guide describes the GPU server and main system assemblies and
components. This guide does not include information about peripheral I/O switches or network fabric components.
Refer to the manufacturer's documentation for that equipment.
Document Versions
H-6150: May 2017. Original version. This guide describes all major components of the CS-Storm 500NX server.
Scope and Audience
This document provides information about the CS-Storm 500NX system. Installation and service information is
provided for users who have experience maintaining high performance computing (HPC) equipment. Installation
and maintenance tasks should be performed by experienced technicians in accordance with the service
agreement.
The information is presented in topic-based format and does not include chapters, appendices, or section
numbering.
Feedback
Visit the Cray Publications Portal at http://pubs.cray.com. Email your comments and feedback to pubs@cray.com.
Your comments are important to us. We will respond within 24 hours.
CS-Storm 500NX Server
The Cray® CS-Storm 500NX™ server has a 4U, 19-inch wide rackmount chassis. Each CS-Storm 500NX server contains two Intel® Xeon® E5-2600 v4 processors, 24 DIMMs (2400 MHz DDR4), up to 3 TB of memory, and 16 2.5-inch drive bays. Each 500NX server supports up to eight NVIDIA® Tesla® P100 GPU accelerators using the NVIDIA NVLink™ high-speed interconnect.
Figure 1. CS-Storm 500NX Server
Table 1. CS-Storm 500NX Features
Chassis:
● 19-inch wide, 4U rackmount chassis
● Up to 12 CS-Storm 500NX chassis in a 48RU rack; 10 in a 42RU rack
● Chassis weight, fully configured: approximately 135 lb (61 kg)
● Dimensions (W x H x D): 17.6 x 7.0 x 31.7 inches (447 x 178 x 805 mm)

GPUs:
● Up to 8 NVIDIA Tesla P100 (SXM2, 300 W)
● NVLink high-speed interconnect (up to 80 GB/s)

Processors:
Two Intel Xeon E5-2600 v4 family (up to 145 W TDP)

Memory Capacity:
24 DIMM slots (up to 3 TB, ECC 3DS LRDIMM, DDR4-2400 MHz)

Expansion Slots:
PCIe 3.0 slots:
● Four x16, low-profile (GPU tray)
● Two x8 (motherboard tray)

Network Interconnect:
Support for multiple interconnects, including:
● InfiniBand FDR/EDR
● Gigabit Ethernet (10/40 GbE)
● Intel® Omni-Path (56/100 Gb/s)

Storage:
● 16 hot-swap 2.5-inch SATA/SAS drive bays
● Up to 8 drives can be NVMe; the rest are SAS or SATA
● SAS/SATA/NVMe backplane in motherboard tray

Power Supplies:
Four 2200 W AC power supplies (2+2 redundancy, Titanium-level efficiency). Each power supply contains a cooling fan.

Cooling:
Air-cooled system:
● Eight 92 mm cooling fans
● Active processor heatsinks
● Passive GPU heatsinks
● Dual air shrouds in GPU tray

I/O Ports (front panel):
● 2 LAN ports, 1 GbE, RJ45
● 1 dedicated IPMI LAN port, RJ45
● 2 USB 3.0 ports
● 1 VGA connector

Operating Environment:
● Operating temperature: 10° to 35°C (50° to 95°F)
● Non-operating temperature: -40° to 70°C (-40° to 158°F)
● Operating relative humidity: 8% to 90% (non-condensing)
● Non-operating relative humidity: 5% to 95% (non-condensing)
Drive Configuration and Status
The CS-Storm 500NX server supports sixteen 2.5-inch storage drives. Eight hybrid connectors on the drive
backplane support NVMe or SAS/SATA drives, and the rest support SAS or SATA as shown below.
The NVMe ports provide high-speed, low-latency connections directly from the CPU to NVMe solid-state drives (SSDs). The direct PCIe interface between the CPU and the NVMe SSDs simplifies driver and software requirements, greatly increasing SSD throughput and significantly reducing storage device latency.
Figure 2. Drive Bay Configuration
Drive Carrier LEDs
Each SAS/SATA/NVMe drive is mounted in a drive carrier with two status LEDs on the front of the carrier. The
drives are mounted in these drive carriers to simplify their installation and removal from the chassis. These
carriers also help promote proper airflow through the drive bays.
Activity LED:
● Blue, solid on: idle SAS/NVMe drive installed
● Blue, blinking/flashing: I/O activity
● Off: idle SATA drive

Status/Failure LED:
● Red, solid on: failure of drive with RSTe support
● Red, blinking slow (1 Hz): rebuild drive with RSTe support
● Red, two blinks and one stop at 1 Hz: hot spare for drive with RSTe support
● Red, on for five seconds, then off: power on for drive with RSTe support
● Red, blinking fast (4 Hz): identify drive with RSTe support
● Green, solid on: safe to remove NVMe drive
● Amber, blinking at 1 Hz: attention state, do not remove NVMe drive
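The LED states above can be expressed as a simple lookup. The following sketch is illustrative only; the function name and state keys are assumptions, not part of any vendor tool.

```python
# Hypothetical helper mapping drive-carrier LED states (from the table
# above) to their meanings. Keys are (led, color, blinking pattern).

DRIVE_LED_STATES = {
    ("activity", "blue", "solid"): "Idle SAS/NVMe drive installed",
    ("activity", "blue", "blinking"): "I/O activity",
    ("activity", "off", "off"): "Idle SATA drive",
    ("status", "red", "solid"): "Failure of drive with RSTe support",
    ("status", "red", "blinking 1 Hz"): "Rebuild drive with RSTe support",
    ("status", "red", "two blinks, one stop at 1 Hz"): "Hot spare for drive with RSTe support",
    ("status", "red", "on 5 s, then off"): "Power on for drive with RSTe support",
    ("status", "red", "blinking 4 Hz"): "Identify drive with RSTe support",
    ("status", "green", "solid"): "Safe to remove NVMe drive",
    ("status", "amber", "blinking 1 Hz"): "Attention state - do not remove NVMe drive",
}

def decode_drive_led(led, color, pattern):
    """Return the drive behavior for an observed LED state."""
    return DRIVE_LED_STATES.get((led, color, pattern), "Unknown LED state")

print(decode_drive_led("status", "green", "solid"))
```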
Controls and Connectors
The CS-Storm 500NX server includes a control panel on the front that houses power buttons and status
monitoring lights. The following figure shows CS-Storm 500NX controls, LEDs, and connectors on the front of the
server.
Figure 3. CS-Storm 500NX Front View
IO Ports
The front I/O ports include:
● Two USB 3.0 ports
● Two 1 Gb Ethernet LAN ports
● One dedicated IPMI 2.0 LAN port
● One VGA (monitor) port
IPMI dedicated LAN port. The Intelligent Platform Management Interface (IPMI) 2.0 architecture provides a
command line interface to the Baseboard Management Controller (BMC) on the 500NX motherboard. IPMI is a
standardized protocol used to manage and monitor a system in-band or out-of-band. As a result, IPMI operates
independently of the host OS and interfaces directly with the hardware. IPMI allows a system administrator to
monitor system health and manage computer events from a remote location. IPMI provides features for
monitoring, logging, recovery, and inventory control through hardware and firmware. These functions are provided
independent of the main CPU, BIOS, and OS.
VGA port. The GPUs process complex image calculations and then route the data out through the VGA port on
the motherboard. A jumper on the motherboard can be set to disable the VGA port.
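Out-of-band monitoring through the IPMI LAN port is typically done with a tool such as ipmitool (for example, `ipmitool -I lanplus -H <bmc-ip> -U <user> sensor`). The sketch below parses a simplified, hypothetical four-column sample of that pipe-separated sensor output; the sensor names and readings are assumptions for illustration.

```python
# Sketch of out-of-band health monitoring via IPMI, as described above.
# SAMPLE_SENSOR_OUTPUT is a hypothetical, simplified form of ipmitool's
# pipe-separated "sensor" listing: name | reading | unit | status.

SAMPLE_SENSOR_OUTPUT = """\
FAN1             | 5600.000   | RPM        | ok
FAN2             | 5400.000   | RPM        | ok
CPU1 Temp        | 42.000     | degrees C  | ok
PS1 Status       | 0x1        | discrete   | ok
"""

def parse_sensors(text):
    """Parse 'name | reading | unit | status' lines into dicts."""
    sensors = []
    for line in text.strip().splitlines():
        name, reading, unit, status = (f.strip() for f in line.split("|"))
        sensors.append({"name": name, "reading": reading,
                        "unit": unit, "status": status})
    return sensors

sensors = parse_sensors(SAMPLE_SENSOR_OUTPUT)
unhealthy = [s["name"] for s in sensors if s["status"] != "ok"]
print(f"{len(sensors)} sensors, unhealthy: {unhealthy or 'none'}")
```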
Front Control Panel Buttons and LEDs
Power. The main power switch applies or removes primary power from the power supply to
the server but maintains standby power. Unplug the server to remove all power.
BMC reset and UID. This button has a dual function, depending on the JBMC_BTN jumper.
This button is recessed from the front of the server. The UID LED is embedded inside the
button. Use a pointed object, such as a pen or a paper clip to push the button. The BMC
reset button either:
● Turns on or off the blue UID light that is used to visually identify a specific server installed in the rack or among several racks/cabinets.
● Resets the BMC that provides IPMI support.
System Reset Button/Information LED
System Reset Button. The reset button is used to reboot the server. This button is
recessed from the front of the server. Use a pointed object, such as a pen or a paper clip to
push the button.
Information LED. Alerts operator to several states, as noted in the table below. The LED is
embedded inside the button.
Information LED
● Red (solid on): An overheat condition has occurred. (This may be caused by cable congestion.)
● Red (blinking, 1 Hz): Fan failure; check for an inoperative fan.
● Red (blinking, 0.25 Hz): Power failure; check for a non-operational power supply.
● Blue (solid on): System ID has been activated. Used to visually identify a specific server among all the other servers in a rack.
● Blue (blinking): Remote UID is on. Use this function to identify the server from a remote location.
I/O Connector LEDs
LAN LED Indicators. Each Ethernet LAN port has two LEDs. The activity LED indicates a network connection when on and transmit/receive activity when blinking. The color of the link/speed LED indicates the data transfer speed.

Link/Speed LED:
● Off: 10 Mbps
● Green: 100 Mbps
● Amber: 1 Gbps

Activity LED:
● Yellow (solid): active connection
● Yellow (blinking): transmit/receive activity
LAN Port Symbols. There are two symbols that provide status information about the LAN ports.
NIC 1. When flashing, indicates network activity on GLAN1.
NIC 2. When flashing, indicates network activity on GLAN2.
Power Supply LEDs
An LED on the rear of the power supply unit displays the status.
● Green (solid): Indicates the power supply is on.
● Amber (solid): Indicates the power supply is plugged in and turned off, or the system is off but in an abnormal state.
● Amber (blinking): The power supply's temperature has reached 63°C. The server will automatically power down when the power supply temperature reaches 70°C and restart when the power supply temperature goes below 60°C.
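The shutdown and restart thresholds above form a hysteresis loop: the supply must cool well below the shutdown point before it restarts. A minimal sketch of that behavior, as an illustrative model rather than firmware code:

```python
# Models the power-supply thermal behavior described above: warn at
# 63 degrees C (blinking amber), power down at 70, and allow restart
# only after the temperature falls below 60.

WARN_C, SHUTDOWN_C, RESTART_C = 63, 70, 60

def next_state(state, temp_c):
    """Return the PSU state ('on', 'warn', or 'off') after a reading."""
    if state in ("on", "warn"):
        if temp_c >= SHUTDOWN_C:
            return "off"
        return "warn" if temp_c >= WARN_C else "on"
    # 'off': stay off until the supply cools below the restart threshold
    return "on" if temp_c < RESTART_C else "off"

state = "on"
for t in (55, 64, 71, 65, 59):   # example temperature trace
    state = next_state(state, t)
print(state)
```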
Fan Module and Power Supply Numbering
The chassis contains eight 9-cm exhaust fans that provide cooling for the system. Four of these fans are
combined with the four hot-plug power supplies. There is no need to power down the system when switching fans
or power supplies.
In the event of a power supply failure, the remaining power modules automatically take over. The failed power
module can be replaced without powering-down the system. Replace modules with the same model.
An amber light on the power supply is illuminated when the power is switched off. A green light indicates that the
power supply is operating.
The following figure shows the numbering of the fans and power supplies based on sensor data provided through
the motherboard BMC (ipmi utilities).
Figure 4. Fan Module and Power Supply Numbering
GPU Tray
The CS-Storm 500NX server supports eight NVIDIA Tesla P100 GPUs. The GPUs are installed on the SXM board
in the GPU tray. The Tesla P100 uses the NVIDIA NVLink™ high-speed, high-bandwidth interconnect. All main
components of the GPU tray are shown in the following figure.
Figure 5. GPU Tray
GPU Numbering
Numbering of the GPUs appears differently, depending on the interface. GPU numbers are silk screened on the
SXM board, displayed when using Intelligent Platform Management Interface (IPMI) commands, and referenced
by the operating system. The different numbering systems are listed below.
Important: All GPU numbers shown in text and figures throughout this document follow the operating system
(OS) numbering scheme.
GPU Numbering

SXM Board:   GPU 0    GPU 1    GPU 2    GPU 3    GPU 4    GPU 5    GPU 6    GPU 7
ipmi:        GPU 1    GPU 2    GPU 3    GPU 4    GPU 5    GPU 6    GPU 7    GPU 8
OS:          GPU 2    GPU 3    GPU 0    GPU 1    GPU 6    GPU 7    GPU 4    GPU 5
Bus Number:  09:00.0  0A:00.0  04:00.0  05:00.0  89:00.0  8A:00.0  85:00.0  86:00.0
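When correlating an OS-reported GPU with its physical position, a cross-reference like the following can help. This is an illustrative sketch built from the table above; the function name is an assumption, not a vendor utility.

```python
# GPU numbering cross-reference from the table above, keyed by the
# silk-screened SXM board number. The ipmi number is board number + 1.

GPU_MAP = {  # sxm board number: (os_number, pci_bus)
    0: (2, "09:00.0"), 1: (3, "0A:00.0"),
    2: (0, "04:00.0"), 3: (1, "05:00.0"),
    4: (6, "89:00.0"), 5: (7, "8A:00.0"),
    6: (4, "85:00.0"), 7: (5, "86:00.0"),
}

def os_to_board(os_number):
    """Translate an OS GPU number to the SXM board position."""
    for board, (os_n, _bus) in GPU_MAP.items():
        if os_n == os_number:
            return board
    raise ValueError(f"no GPU with OS number {os_number}")

print(os_to_board(0))    # which board position is OS GPU 0?
print(GPU_MAP[0][1])     # PCI bus of board GPU 0
```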
Tesla P100 GPU
The Tesla P100 uses the NVIDIA NVLink™ high-speed GPU interconnect to provide data transfers between GPUs. This bidirectional interconnect scales applications across multiple GPUs for up to 5X higher performance. The NVLink connection provides GPU-to-GPU data transfers at up to 160 GB/s of bidirectional bandwidth, 5x the bandwidth of PCIe Gen 3 x16. The GPUs still communicate using the PCIe protocol when transferring data to and from the CPUs, memory, and expansion slots.
The NVIDIA® Tesla® P100 GPU accelerator uses the NVIDIA Pascal™ GPU architecture to deliver in excess of
3,500 embedded cores and flexible mixed-precision computing options. The P100 offers customers a choice of
double-precision, single-precision or half-precision compute operation, empowering users to trade off precision
and performance for their own specific application requirements.
Figure 6. Tesla P100 GPU
Specifications
● GPU Architecture: NVIDIA Pascal
● Performance: 5.3 TFLOPS double-precision (FP64), 10.6 TFLOPS single-precision (FP32), 21.2 TFLOPS half-precision (FP16)
● NVIDIA CUDA® Cores: 3584
● GPU Memory: 16 GB CoWoS HBM2
● Memory Bandwidth: 732 GB/s
● Interconnect: NVIDIA NVLink
● Max Power Consumption: 300 W
● Thermal Solution: Passive heatsink
● Form Factor: SXM2
GPU SXM Board
The GPU SXM board uses the NVIDIA Cube Mesh NVLink architecture. A direct connection between all GPUs is provided through single NVLink connections (peak of 20 GB/s). The GPU SXM board has four PCIe 3.0 x16 slots that support low-profile expansion cards in the front of the chassis.
A direct connection from the GPUs to an external high-speed interface (HSI) is provided through a PCIe switch to an x16 slot in the front of the GPU tray. Each PCIe switch is root to two GPU modules and one PCIe 3.0 x16 slot on the GPU SXM board.
A block diagram showing the NVLink and PCIe interconnections, and a figure showing major SXM board
components are shown below.
Figure 7. Interconnect Block Diagram
(Diagram summary: green arrows indicate NVLink GPU-to-GPU transfers; black arrows indicate PCIe 3.0 transfers. CPU 1 and CPU 2 on the motherboard each drive an x8 link to the motherboard tray slots and x16 links through the midplane to the GPU SXM board. On the SXM board, four PCIe switches each connect two GPUs and one PCIe 3.0 x16 slot, slots 1-4.)
Figure 8. GPU SXM Board Components
(Component callouts: PCIe bus/control connectors JMP1-JMP8; power connectors JD1-JD4; four PCIe switches; NVLink connectors; guide pin holes; GPUs 0-7; and slots 1-4, each PCIe 3.0 x16. GPU numbers shown here are based on the numbers/positions reported through the operating system; they do not match what is stenciled on the SXM board.)
Midplane
The 4U midplane board connects the GPU SXM board to the motherboard. The midplane mounts to the fan/power-supply cage in the back of the CS-Storm 500NX server. To access the midplane, the fan/power-supply cage must be removed from the back of the server chassis.
The following figure shows all connectors on the front side of the midplane.
Figure 9. Midplane Board
(Front side, facing the GPU and motherboard trays: 16 mounting screws; PCIe bus/control connectors JMP1-JMP18 to/from the trays; a row of connectors for the GPU tray; a row of connectors for the motherboard tray; and main power connectors J32-J43 to the motherboard and GPU SXM board.)
Motherboard Tray
The CS-Storm 500NX motherboard supports two Intel Xeon E5-2600 v4 processors and 24 DIMM slots across
eight memory channels.
The motherboard offers two PCIe expansion card slots. The motherboard tray, like the GPU tray, has a cover that
can be removed to access internal components.
All main components of the motherboard tray are shown in the following figure.
Figure 10. Motherboard Tray
Drive Backplane
The storage device (drive) backplane connects the drives to the motherboard. The backplane mounts to the back
of the drive cage inside the motherboard tray.
The backplane is a 16-port, 2U, SAS3, 12 Gbps, hybrid backplane that supports up to eight 2.5-inch SAS3/SATA3
HDD/SSD drives and eight SAS3/SATA3/NVMe drives.
The backplane has six 4-pin power connectors that cable to three 8-pin power connectors on the motherboard.
The backplane has two mini-SAS connectors (compatible with SATA drives) that cable to I-SATA/S-SATA ports on
the motherboard.
The following figure shows the connectors on both sides of the backplane.
Figure 11. Drive Backplane
(Front side, facing the drives: drive slots 0-15 with HDD activity and status/fail LEDs; slots 8-15 are the NVMe-capable slots. Back side, facing the motherboard: mini-SAS HD connectors for HDD 0-3, 4-7, 8-11, and 12-15; OCuLink connectors NVMe 0-7; and a row of power connectors to the motherboard: JPW1 (slots 8-9), JPW2 (slots 10-12), JPW3 (slots 13-15), JPW4 (slots 0-2), JPW5 (slots 3-5), JPW6 (slots 6-7).)

Slot Groups:
● Slots 0-7 support SAS3/SAS2/SATA3 HDD/SSD.
● Slots 8-15 support both SAS3/SAS2/SATA3 HDD/SSD and NVMe storage devices.
● Every 4 slots are designed as one group (slots 0-3 / 4-7 / 8-11 / 12-15). Drives within a group can be either SAS/SATA or NVMe, but cannot be mixed.

Power Connectors:
● All power connectors (JPW1-6) must be connected.
● Port NVMe 0 must be connected for NVMe drives 0-3 to function properly.
● Port NVMe 4 must be connected for NVMe drives 4-7 to function properly.
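The grouping rules above can be captured in a small validation sketch. This is an illustrative check, not vendor code; the function name and layout representation are assumptions.

```python
# Enforces the backplane grouping rules above: slots are grouped in
# fours (0-3, 4-7, 8-11, 12-15); NVMe is allowed only in slots 8-15;
# and drive types cannot be mixed within a group.

def validate_layout(layout):
    """layout: dict of slot number -> 'sas', 'sata', or 'nvme'."""
    for start in (0, 4, 8, 12):
        group = [layout.get(s) for s in range(start, start + 4)]
        kinds = {"nvme" if d == "nvme" else "sas_sata"
                 for d in group if d is not None}
        if len(kinds) > 1:
            raise ValueError(f"slots {start}-{start+3}: SAS/SATA and NVMe mixed")
        if "nvme" in kinds and start < 8:
            raise ValueError(f"slots {start}-{start+3}: NVMe only allowed in slots 8-15")
    return True

# 8 SATA drives in slots 0-7 plus 4 NVMe drives in slots 8-11 is valid:
ok = validate_layout({**{s: "sata" for s in range(8)},
                      **{s: "nvme" for s in range(8, 12)}})
print(ok)
```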
Cabling
The Cray® CS-Storm 500NX™ server has 16 disk slots. Example cabling diagrams for common configurations are
shown in the following two figures.
● Eight disk slots are supported through two mini-SAS HD connectors on the motherboard and disk backplane.
● The eight remaining disk slots are supported through internal ports on a PCIe SATA adapter card.
● The eight slots that support NVMe drives (8-15) are supported through internal ports on PCIe NVMe adapter cards.
Figure 12. Cabling Connections for 16 SATA SSDs
Important: Cables in the server have been carefully routed to prevent them from blocking the flow of cooling air that moves through the chassis. When reconnecting any cables, take care to route them as they were originally routed.
(Diagram summary: Motherboard ports I-SATA 0-3 and S-SATA 0-3 cable to the backplane mini-SAS connectors with CBL-SAST-0819 cables. An internal 8-port SATA adapter card in slot 1 cables to backplane connectors SAS 0-3 and SAS 4-7 with CBL-SAST-0550 cables. OCuLink connectors CN12/CN13 (from CPU1 CN1/CN2) route to slot 2, and CN16/CN17 (from CPU2 CN1/CN2) route to slot 1. Backplane power connectors JPW1-JPW6 cable to motherboard connectors JPW17-JPW19 with CBL-PWEX-0979 cables. Fan 1 (CPU1) and Fan 3 (CPU2) connectors are also shown.)
Figure 13. Cabling Connections for SATA/NVMe SSDs
(Two configurations are shown. First: 8 SATA SSDs in slots 0-7 and 4 NVMe SSDs in slots 8-11, using one internal 4-port NVMe adapter card whose channels CH1-CH4 cable to backplane ports NVMe 0-3. Second: 8 SATA SSDs in slots 0-7 and 8 NVMe SSDs in slots 8-15, using two internal 4-port NVMe adapter cards in slots 1 and 2, cabled to backplane ports NVMe 0-7. NVMe channels use CBL-SAST-0974-1 cables.)
CS-Storm 500NX Motherboard
The CS-Storm 500NX motherboard supports dual Intel® Xeon® E5-2600 v4 series processors that provide balanced performance, power efficiency, and features to address the diverse needs of next-generation data centers. With the PCH C612 chipset, the motherboard includes the Intel® Intelligent Power Node Manager, Management Engine (ME), and Digital Media Interface (DMI). The motherboard has 24 DIMM slots supporting up to 1536 GB of RDIMM (Registered) and LRDIMM (Load Reduced) ECC/Non-ECC DDR4 memory at 2400 MT/s, providing higher compute performance and secure encryption.
Figure 14. CS-Storm 500NX Motherboard
Table 2. CS-Storm 500NX Motherboard Specifications

Processor Support:
Dual Intel E5-2600 v4 series processors (LGA 2011, R3 socket)

Memory:
● 24 DIMM slots
● 1536 GB LRDIMM or RDIMM memory (maximum)
● DDR4 data transfer rates of 2400/2133/1866/1600 MT/s

Chipset:
Intel C612 Platform Controller Hub (PCH)

External I/O Ports:
● Two 1 GbE LAN (RJ45) ports
● One dedicated IPMI 2.0 LAN (RJ45) port
● Two USB 3.0 ports
● One VGA port

PCIe 3.0 Expansion Slots:
● One x8 slot supported by CPU2 (Slot1)
● One x8 slot supported by CPU1 (Slot2)

Power Connections:
● Two PSU main power to midplane connectors (48-pin)
● 10 power connectors (12V, 8-pin)
● Two HDD power connectors (12V, 8-pin)
● Two SATA DOM connectors (5V, 3-pin)

Fan Connectors:
Eight chassis/CPU fan connectors (12V, 4-pin)

SATA Ports:
10 SATA 3.0 ports: 4 I-SATA, 2 I-SATA with SuperDOM support, and 4 S-SATA. RAID 0, 1, 5, and 10 can be enabled.

TPM/Port 80 Connector:
A Trusted Platform Module/Port 80 connector is available to enhance system performance and improve data security.
Component Locations
Motherboard component locations, connector types, and jumper settings are shown in the following figure.
Figure 15. Motherboard Components
Connectors
Power Connectors. The motherboard has two power connectors (JPWR1/2, 48-pin) that receive power from the midplane. At least two power supply modules are required to power up the system.
JPW17-19. These connectors provide power through the midplane to the drives.
CN OCuLink Connectors (CN10-CN17). These connectors are used to connect the PCIe bus from CPUs to
PCIe slots.
● CN10 contains CPU1 PCIe port 1A[3:0]
● CN11 contains CPU1 PCIe port 1A[7:4]
● CN14 contains CPU2 PCIe port 1A[3:0]
● CN15 contains CPU2 PCIe port 1A[7:4]
● CN12 contains Slot2 PCIe[3:0]
● CN13 contains Slot2 PCIe[7:4]
● CN16 contains Slot1 PCIe[3:0]
● CN17 contains Slot1 PCIe[7:4]
Serial Port Header. A serial port (COM1) is located on the front panel on the motherboard. This connection
provides serial connection support.
Powered SATA DOM (SuperDOM) Connectors. Two powered SATA DOM (Device-on-Module) connectors are located at JSD1/JSD2 on the motherboard. These connectors are used with Supermicro SuperDOMs, which are yellow SATA devices with built-in power pins. They also provide backward-compatible power support to non-Supermicro SATA DOMs that require external power, without the need for separate power cables.
Fan Connectors. This motherboard has three server/CPU fan headers (Fan 1 - Fan 3). These 4-pin fan connectors are backward compatible with traditional 3-pin fans; however, fan speed control is available for 4-pin fans only. Fan speeds are controlled through IPMI thermal management. Pin assignments: 1 = ground, 2 = +12V, 3 = tachometer, 4 = pulse-width modulation (fan speed control).
TPM/Port 80 Header. A Trusted Platform Module/Port 80 header, located at JTPM1, provides TPM support and
Port 80 connection. This connector is used to enhance system performance and data security.
M.2 Connector. The M.2 slot is designed for mounting internal devices. The motherboard provides an M-key-only socket dedicated to SSD devices, for native PCIe SSD support. (M.2 was formerly known as Next Generation Form Factor [NGFF].)
M.2 specifications:
● M.2 socket type: Socket 3
● M.2 PCIe bus width: PCIe Gen2 x4 from PCH
● M.2 adapter key: M key or B+M key
● M.2 adapter type: PCIe SSD only
Jumper Settings
Jumpers can be used to modify the operation of the motherboard. Jumpers create shorts between two pins to
change the function of the connector. Jumpers are identified on the above Component Locations figure; Pin 1 is
identified by a black square behind the pin (square solder pad on the motherboard). Jumper pins are described
below:
CMOS Clear
JBT1 is used to clear the CMOS. Instead of pins, this "jumper" consists of contact pads to prevent accidental clearing of the CMOS. To clear the CMOS, use a metal object such as a small screwdriver to touch both pads at the same time to short the connection. For an ATX power supply, you must completely shut down the system and then short JBT1 to clear the CMOS. Clearing the CMOS also clears all passwords.
VGA Enable
Jumper JPG1 allows the user to enable the onboard VGA connector. Pins 1-2 = enabled
(default). Pins 2-3 = disabled.
BMC Enable
Jumper JPB1 is used to enable or disable the embedded baseboard management controller (BMC) that provides IPMI 2.0/KVM support on the board. Pins 1-2 = BMC enabled (default). Pins 2-3 = disabled (for engineering debugging only; do not use this setting). Do not change the default setting. Disabling the BMC also disables the onboard graphics controller, hardware monitoring, and system health management.
Manufacturer Mode Select
Close pins 2 and 3 of Jumper JPME2 to bypass SPI flash security and force the system to
operate in the Manufacturer mode. This mode enables the user to flash the system firmware
from a host server for system setting modifications. Pins 1-2 = normal (default). Pins 2-3 =
manufacturer mode.
I2C Bus for VRM
Jumpers JVRM1 and JVRM2 allow the BMC or the PCH to access CPU and memory VRM
controllers. Pins 1-2 = BMC (default). Pins 2-3 = PCH.
BMC Button Function
Jumper JBMC_BTN controls the function of the BMC Button. Connecting pins 1-2 sets the
BMC button to toggle the UID LED. Connecting pins 2-3 sets the BMC button to reset the
BMC chip.
LAN1/LAN2 Enable
Jumper JPL1 allows the user to disable the LAN1 and LAN2 ports. Pins 1-2 = enabled
(default). Pins 2-3 = disabled.
Motherboard Architecture and Features
The architecture of the motherboard is developed around the integrated features and functions of the Intel® Xeon® E5-2600 v4 processor family, the Intel C612 chipset, and the Intel I350 (1 GbE) Ethernet controller.
The following figure provides an overview of the motherboard architecture, showing the features and
interconnects of the major subsystem components.
Figure 16. Motherboard Architecture
(Diagram summary: CPU 1 and CPU 2 are connected by two QPI links at 9.6 GT/s. Each CPU has four DDR4 memory channels: A-D on CPU 1 and E-H on CPU 2. PCIe 3.0 x16 and x8 links from both CPUs route through the midplane to the GPU SXM board and the PCIe 3.0 slots. CPU 1 connects to the PCH C612 through DMI2/PCIe 2.0 (5 GB/s). The PCH provides SATA3 6.0 Gb/s ports 0-5 and sSATA ports 0-3 (through mini-SAS HD connectors, with SATA DOM support), internal and external USB 3.0 ports, and the SPI BIOS flash (128 MB). The I350 Ethernet controller provides LAN1 and LAN2 at 1.0 Gb/s; the AST2400 BMC (PCIe 2.0 x1) provides the dedicated IPMI LAN, VGA (DB-15), and internal COM serial port.)
Processor and Chipset Overview
Built upon the functionality and capability of the dual processors (Socket R3) and the Intel PCH C612, the
motherboard provides the best balanced solution of performance, power-efficiency, and features to address the
diverse needs of next-generation computer users.
With support of new Intel Microarchitecture 22nm Processing Technology, the motherboard dramatically increases
compute performance and features faster, more secure encryption for a multitude of server applications. This
platform offers maximum I/O expandability, energy efficiency, and data reliability.
The PCH C612 chip provides Enterprise SMBus and MCTP support with the following features:
● DDR4 288-pin memory support on Socket R3
● Support for MCTP protocol and ME
● Intel® Advanced Vector Extensions 2.0
● Improved I/O capabilities for high-storage-capacity configurations
● Low-power, high-reliability thermal profile processor options
● SPI enhancements
● Intel Node Manager 3.0
● Intel Virtualization Technology for Directed I/O (Intel VT-d)
● BMC support for remote management, virtualization, and the security package for enterprise platforms
Special Features
Recovery from AC Power Loss
The Basic Input/Output System (BIOS) provides a setting that determines how the system responds when AC power is lost and then restored. The BIOS can be set for the system to remain powered off (in which case you must press the power switch to turn it back on), or for it to automatically return to the power-on state. See the Advanced BIOS Setup section for this setting. The default setting is Last State.
System Health Monitoring
The BMC on the motherboard monitors system health. An onboard voltage monitor continually scans the following onboard voltages: VCCP0, VCCP1, VDDQAB/CD/EF/GH, +1.5V_PCH, +1.05V_PCH, +1.2V_BMC, +12V_STBY, +3.3V, +5V, +3.3V standby, +5V standby, and VBAT. The BMC also monitors the CPU, memory, PCH, and system temperatures. If a voltage becomes unstable, a warning is given or an error message is sent to the screen. The voltage thresholds can be adjusted to define the sensitivity of the voltage monitor.
Temperature Control
A thermal control sensor monitors the CPU temperatures in real time and turns on the
onboard cooling fans whenever the CPU temperature exceeds a user-defined threshold to
prevent the CPU from overheating. The thermal control sensor also sends warning
messages to alert the user when the chassis temperature is too high.
The system health monitoring support provided by the BMC controller can also check the RPM status of each cooling fan. The speeds of the CPU heatsink fan, rear power supply fans, and fan modules are controlled by IPMI thermal management.
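The voltage and temperature sensors described above can be read over IPMI. As an illustration only (this helper is not part of the product; it assumes the standard pipe-delimited column layout of `ipmitool sensor` output, and the sample lines are invented, not captured from a 500NX), a small parser can flag sensors whose status is neither `ok` nor `na`:

```python
def parse_ipmi_sensors(text):
    """Parse `ipmitool sensor` style output: pipe-separated columns of
    name | reading | units | status | six threshold values."""
    sensors = []
    for line in text.strip().splitlines():
        cols = [c.strip() for c in line.split("|")]
        if len(cols) < 4:
            continue  # skip malformed lines
        sensors.append({"name": cols[0], "reading": cols[1],
                        "units": cols[2], "status": cols[3]})
    return sensors

def unhealthy(sensors):
    """Return names of sensors whose status is neither 'ok' nor 'na'."""
    return [s["name"] for s in sensors if s["status"] not in ("ok", "na")]

# Illustrative sample output (hypothetical values, not from real hardware)
sample = """\
CPU1 Temp | 38.000 | degrees C | ok | na | na | na | 85.000 | 90.000 | na
12V_STBY  | 11.904 | Volts     | ok | na | 10.761 | na | na | 13.299 | na
VBAT      | 2.800  | Volts     | cr | na | 2.900  | na | na | 3.400  | na
"""
print(unhealthy(parse_ipmi_sensors(sample)))  # ['VBAT']
```

On a live system the input would come from running `ipmitool sensor` against the BMC, for example over the dedicated IPMI LAN port.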
Memory Support and Population
The motherboard supports up to 1536 GB of LRDIMM (load-reduced) and RDIMM (registered) ECC/non-ECC DDR4 (288-pin) memory at up to 2400 MT/s in 24 DIMM slots.
Memory speed depends on the CPUs. For best memory performance, install memory modules of the same type and speed. Mixing DIMMs of different types or speeds is not allowed.
Install DIMMs in pairs, that is, use an even number of DIMMs. All channels in a system will run at the fastest
common frequency. Refer to the following figure and table.
Figure 17. DIMM Slot Identification
[Figure: CPU1 serves channels A-D (slots A1-A3, B1-B3, C1-C3, D1-D3); CPU2 serves channels E-H (slots E1-E3, F1-F3, G1-G3, H1-H3).]
Table 3. Processor and Memory Module Population for Optimal Performance

For memory to work properly, follow the population rules below.

1 CPU & 2 DIMMs: CPU1; DIMMs A1, B1
1 CPU & 4 DIMMs: CPU1; DIMMs A1, B1, C1, D1
1 CPU & 5-8 DIMMs: CPU1; DIMMs A1, B1, C1, D1 and any pairs in A2|B2|C2|D2 slots
2 CPUs & 4 DIMMs: CPU1 + CPU2; DIMMs A1, B1, E1, F1
2 CPUs & 6 DIMMs: CPU1 + CPU2; DIMMs A1, B1, C1, D1, E1, F1
2 CPUs & 8 DIMMs: CPU1 + CPU2; DIMMs A1, B1, C1, D1, E1, F1, G1, H1
2 CPUs & 10-14 DIMMs: CPU1 + CPU2; DIMMs A1, B1, C1, D1, E1, F1, G1, H1 and any pairs in CPU1 and CPU2 DIMM slots
2 CPUs & 16 DIMMs: CPU1 + CPU2; DIMMs A1, B1, C1, D1, E1, F1, G1, H1 and A2, B2, C2, D2, E2, F2, G2, H2
2 CPUs & 18-24 DIMMs: CPU1 + CPU2; fill in this order: A1, B1, C1, D1, E1, F1, G1, H1, then A2, B2, C2, D2, E2, F2, G2, H2, then in pairs A3|B3, C3|D3, E3|F3, G3|H3
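The population rules in Table 3 can be sketched as a small helper that returns which slots to fill. This is an illustrative reading of the table, not a Cray utility: `dimm_slots` is a hypothetical name, and for two CPUs it assumes channel pairs alternate between CPU1 (A/B, then C/D) and CPU2 (E/F, then G/H), which reproduces the slot sets listed in each row of the table.

```python
def dimm_slots(num_cpus, num_dimms):
    """Return a Table 3-compliant slot list for the given CPU/DIMM count.

    Assumption (consistent with every Table 3 row): bank 1 fills before
    bank 2, bank 2 before bank 3, and with two CPUs the channel pairs
    alternate between CPU1 (A/B, C/D) and CPU2 (E/F, G/H).
    """
    if num_dimms % 2:
        raise ValueError("install DIMMs in pairs (even counts only)")
    pairs = ["AB", "CD"] if num_cpus == 1 else ["AB", "EF", "CD", "GH"]
    order = [f"{ch}{bank}" for bank in (1, 2, 3) for pair in pairs for ch in pair]
    if num_dimms > len(order):
        raise ValueError("too many DIMMs for this CPU count")
    return order[:num_dimms]

print(dimm_slots(2, 4))  # ['A1', 'B1', 'E1', 'F1']
```

With one CPU the order reduces to A1, B1, C1, D1 and then the higher banks, matching the single-CPU rows above.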
Table 4. Populating RDIMM/LRDIMM DDR4 Memory

Speeds are in MT/s. All configurations run at 1.2 V with 3 slots per channel; DIMM capacities are listed for 4 Gb and 8 Gb DRAM densities. Each speed column gives E5-2600 v3 / E5-2600 v4 values by DIMMs per channel (DPC).

Type        Ranks/Width  Capacity (4 Gb / 8 Gb)  1 DPC (v3/v4)  2 DPC (v3/v4)  3 DPC (v3/v4)
RDIMM       SRx4         8 GB / 16 GB            2133 / 2400    1866 / 2133    1600 / 1600
RDIMM       SRx8         4 GB / 8 GB             2133 / 2400    1866 / 2133    1600 / 1600
RDIMM       DRx8         8 GB / 16 GB            2133 / 2400    1866 / 2133    1600 / 1600
RDIMM       DRx4         16 GB / 32 GB           2133 / 2400    1866 / 2133    1600 / 1600
LRDIMM      QRx4         32 GB / 64 GB           2133 / 2400    2133 / 2400    1600 / 1866
LRDIMM 3DS  8Rx4         64 GB / 128 GB          2133 / 2400    2133 / 2400    1600 / 1866
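Table 4 collapses to a small lookup, since every RDIMM row shares one speed profile and both LRDIMM rows share another. The sketch below is illustrative only; the `SPEED` table and `memory_speed` helper are invented names, not part of any Cray tooling.

```python
# Speed (MT/s) by DIMM family, CPU generation, and DIMMs per channel,
# transcribed from Table 4 (all configurations at 1.2 V).
SPEED = {
    ("RDIMM",  "v3"): {1: 2133, 2: 1866, 3: 1600},
    ("RDIMM",  "v4"): {1: 2400, 2: 2133, 3: 1600},
    ("LRDIMM", "v3"): {1: 2133, 2: 2133, 3: 1600},
    ("LRDIMM", "v4"): {1: 2400, 2: 2400, 3: 1866},
}

def memory_speed(family, cpu_gen, dpc):
    """Look up the DDR4 transfer rate for a module family ('RDIMM' or
    'LRDIMM'), CPU generation ('v3' or 'v4'), and DPC count (1-3)."""
    return SPEED[(family, cpu_gen)][dpc]

print(memory_speed("LRDIMM", "v4", 3))  # 1866
```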
Rack Power and Server Configuration
The information in this section is provided to assist Cray customers with identifying rack enclosure, power
requirement, and PDU options for configuring a rack of CS-Storm 500NX servers.
Rack Space
The following table lists the four Cray-approved American Power Conversion (APC™) NetShelter SX enclosures.

Rack Enclosure                 AR3300     AR3307     AR3350     AR3357
Height (rack units)            42 U       48 U       42 U       48 U
Width                          600 mm     600 mm     750 mm     750 mm
                               (23.6 in)  (23.6 in)  (29.5 in)  (29.5 in)
Depth                          1200 mm    1200 mm    1200 mm    1200 mm
                               (47.2 in)  (47.2 in)  (47.2 in)  (47.2 in)
CS-Storm 500NX servers (max)   10         12         10         12
Switches                       2          0          2          0
PDUs (maximum)                 2          2          4          4
Total PDU Power

Maximum (Nominal) kW:
Delta/Wye  Amperage  208 V  220 V  230 V  240 V
Delta      60        21.6   22.9   23.9   24.9
Wye        30        18.7   19.8   20.7   21.6
Wye        32        20.0   21.1   22.1   23.0
Wye        60        37.4   39.6   41.4   43.2
Wye        63        39.3   41.6   43.5   45.4

80% UL Derated (Usable) kW:
Delta/Wye  Amperage  208 V  220 V  230 V  240 V
Delta      60        17.3   18.3   19.1   20.0
Wye        30        15.0   15.8   16.6   17.3
Wye        32        16.0   16.9   17.7   18.4
Wye        60        30.0   31.7   33.1   34.6
Wye        63        31.4   33.3   34.8   36.3
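The table's figures can be reproduced arithmetically: delta circuits use sqrt(3) x V x A, wye circuits use 3 x V x A (treating the listed voltage as the per-phase value, which is how the tabulated numbers work out), and the usable figure is 80% of nominal per the UL continuous-load derating. The helper names below are hypothetical; this is a sketch of the arithmetic, not a sizing tool.

```python
import math

def pdu_power_kw(service, amps, volts):
    """Nominal three-phase PDU power in kW: sqrt(3)*V*A for delta
    circuits, 3*V*A for wye circuits (V taken as per-phase voltage,
    matching the Total PDU Power table)."""
    factor = math.sqrt(3) if service == "delta" else 3.0
    return factor * volts * amps / 1000.0

def usable_kw(service, amps, volts):
    """Usable power after the 80% UL continuous-load derating."""
    return 0.8 * pdu_power_kw(service, amps, volts)

print(round(pdu_power_kw("delta", 60, 208), 1))  # 21.6
print(round(usable_kw("wye", 30, 208), 1))       # 15.0
```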
Usable Branch Power vs. Input Voltage

With a 20 A branch circuit breaker, the 80% UL derating yields 16 A of usable current:

Input Voltage     208 V   220 V   230 V   240 V
Usable Power (W)  3328    3520    3680    3840
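The branch figures are simply the 80%-derated breaker current multiplied by the input voltage. A one-line sketch (the helper name is hypothetical):

```python
def usable_branch_watts(volts, breaker_amps=20):
    """Usable single-branch power: 80% UL derating of the branch
    breaker (20 A -> 16 A continuous) times the input voltage."""
    return 0.8 * breaker_amps * volts

print([usable_branch_watts(v) for v in (208, 220, 230, 240)])
# [3328.0, 3520.0, 3680.0, 3840.0]
```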
CS-Storm 500NX Power Consumption
Maximum power consumption (fully configured): 3600 W
Motivair® ChilledDoor® Configuration

Required system AC power: 3600 W per server.

ChilledDoor  Nominal    Usable       Racks    Supported
Model        Power      Power        (APC)    500NX Servers
M8           20-29 kW   25 kW        42/48 U  6.9
M12          30-45 kW   35 kW        42/48 U  9.7
M14 (TBD)    55 kW      50 kW (TBD)  42/48 U  13.9
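The supported-server counts follow from dividing each door's usable cooling capacity by the 3600 W maximum draw of a fully configured server. A sketch (the helper name is hypothetical):

```python
def supported_servers(usable_kw, server_kw=3.6):
    """Servers per ChilledDoor: usable door capacity divided by the
    3.6 kW maximum draw of a fully configured CS-Storm 500NX."""
    return round(usable_kw / server_kw, 1)

print([supported_servers(kw) for kw in (25, 35, 50)])  # [6.9, 9.7, 13.9]
```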
PDU Configuration
Rack power consumption data for different redundancy configurations along with the maximum number of
supported CS-Storm 500NX servers are shown in the following three tables. The table below is a key that applies
to these three tables.
PDU 1+1 Redundancy and PSU 2+2 Redundancy
No PDU Redundancy, But PSU 2+2 Redundancy
No PDU Redundancy, But PSU 3+1 Redundancy
600 mm Wide Rack Setup
CS-Storm 500NX rail depth (post inner spacing): 28 to 33.75 inches.
1. Move the front rail support posts 3.5 inches inward. Set the post-to-post distance to 28.5 to 29 inches.
2. Add a vertical cable management bracket to the front side of the rack. (Cray P/N: 101803000).
http://www.apc.com/shop/us/en/products/Narrow-Vertical-Cable-Organizer-NetShelter-SX-42U/P-AR7511
3. Relocate one set of regular cable management brackets to the middle of the rack, and adjust the remaining
cable management brackets to the back end of the servers.
4. Add the APC Air Recirculation Prevention kit to prevent air-recirculation after the front rail support posts are
moved inwards. (Cray P/N: 200-00124A )
NetShelter® SX Air Recirculation Prevention Kit (AR7708): http://www.apc.com/salestools/ASTE-6Z6JYP/ASTE-6Z6JYP_R0_EN.pdf
5. When the servers do not fully occupy the U space, provide another cable passage from the server front to the switches through a 1U panel (AR8429, 1U cable pass-through manager with brush strip; Cray P/N: 004-01329A).
http://www.apc.com/shop/vn/en/products/Horizontal-Cable-Organizer-1U-w-brush-strip/P-AR8429?isCurrentSite=true
750 mm Wide Rack Setup
CS-Storm 500NX rail depth (post inner spacing): 28 to 33.75 inches.
1. Move the front rail support posts 4.5 inches inward, and set the post-to-post distance to 28.5 inches.
2. Move one set of regular cable management brackets to the front and adjust the position of the remaining
cable management bracket.
3. Add the APC Air Recirculation Prevention kit to prevent air-recirculation after the front rail support posts are
moved inwards. (Cray P/N: 200-00124A ).
http://www.apc.com/salestools/ASTE-6Z6JYP/ASTE-6Z6JYP_R0_EN.pdf
4. When the servers do not fully occupy the U space, provide another cable passage from the server front to the switches through a 1U panel (AR8429, 1U cable pass-through manager with brush strip; Cray P/N: 004-01329A).
http://www.apc.com/shop/vn/en/products/Horizontal-Cable-Organizer-1U-w-brush-strip/P-AR8429?isCurrentSite=true
Replacement Procedures
This section includes instructions for removing and installing the main system components.
The only tool needed to install components and perform maintenance is an adjustable torque driver with a Phillips head. Torque specifications are provided below and within the procedures.
Torque Specifications
Motherboard: 6-8 lbf-in
Processor heatsink: no more than 4.3 lbf-in
SXM board: 6-8 lbf-in
GPU: 6-8 lbf-in
GPU heatsink: no more than 4.3 lbf-in
ESD Precautions
Observe electrostatic discharge (ESD) precautions during the entire removal and installation process. Failure to
do so can result in equipment damage. Required apparel includes an ESD smock, ESD shoes, and an ESD wrist
strap.
Safety Precautions
Personnel handling this equipment must be trained to follow these instructions. They are responsible for determining whether additional requirements are necessary under applicable workplace safety laws and regulations.
Electrical safety precautions should be followed to protect personnel from harm and the equipment from damage.
Power should always be disconnected when removing or installing server chassis and main system components.
Server Lift Setup
Prerequisites
● Time: 10 minutes
● Tools: server lift
About this task
The server lift may require assembly after removing it from a shipping container.
Procedure
1. Position the lift horizontally on the ground.
CAUTION: Personal or equipment injury. Two people are required to perform this procedure. Failure to assemble the server lift with two people could cause personal injury or equipment damage.
2. Slide each leg into a base socket until the leg lock pin snaps into the leg.
Figure 18. Server Lift Components
[Figure callouts: winch assembly, carriage lock pins, fork support tubes, ladder, forks, server platform, base socket, leg]
3. Stand the lift back up vertically.
4. Slide each fork onto the fork support tubes and secure each side with two lock pins.
5. Position the server platform onto the forks. The top-back of the platform fits behind the lower fork support
tube. No tools or hardware are needed.
Note: An earlier lift model included a three-sided server platform that slid onto the forks and was held in place by a cable and carabiner.
6. Remove and reverse the winch handle so the handle grip faces the operator. Verify the winch handle pin locks
the handle in place.
Tray Removal and Installation
First, power down the server and disconnect all cables.
Removal
1. Push the gray release latch on each side toward the center of the tray.
2. While holding the gray latches in, push down the red seating lever on each side until the tray unseats from the
chassis.
3. Grasp the red handles and gently begin to pull the tray out of the chassis.
4. The tray stops when the release latches engage. Push the tray release button on each side to release the
tray.
5. Hold the tray securely and remove the tray from the chassis.
Installation
1. Slide the tray into the chassis, and carefully push it in until it seats with the midplane connectors.
2. Grasp the red seating levers and push them up until the tray locks in place.
Tray Cover Removal
The motherboard tray and the GPU tray each have a cover that can be removed to access internal components.
CAUTION: Proper airflow and cooling. Do not operate the server without the covers for both the motherboard and GPU trays. These covers must be in place to allow proper airflow and prevent overheating.
To Remove a Tray Cover:
Push the release buttons on both sides of the cover toward the middle of the tray and rotate the cover up. Then lift
the cover off the tray.
To Install a Tray Cover:
Lower the cover onto the pin on each side of the chassis. Then rotate the cover down until the release buttons
snap into the latches on the sides of the chassis.
GPU Module and Heatsink Replacement
About this task
This procedure describes how to replace a passive heatsink and GPU from the CS-Storm 500NX chassis.
● Tools:
  ○ Adjustable torque driver
  ○ Plastic putty knife
● Supplies:
  ○ Isopropyl alcohol, 90% or higher concentration
  ○ Clean, lint-free cloth
  ○ Thermal paste/grease (100117500: Thermal grease G-751, 1.5 g syringe, or equivalent)
Procedure
Remove Heatsink
The heatsink is attached to the GPU with captive fasteners. Loosen the four screws following a diagonal
pattern.
1. Remove the GPU tray from the server chassis, and remove the cover.
2. Use an adjustable torque driver, or screwdriver, to loosen the four captive screws in the heatsink. Follow the
order shown below (1, 2, 3, 4).
3. Hold the heatsink and gently wiggle it to loosen it from the GPU. (Do not use excessive force!)
4. Once the heatsink is loosened, lift it straight up to remove it from the GPU module.
Clean Heatsink
5. Use a plastic putty knife to remove excess thermal interface material (paste/grease) from the heatsink.
6. Clean the heatsink with a clean, lint-free cloth dampened with isopropyl alcohol. Make sure the cloth is not dripping. Do not use a paper towel, as it will leave lint behind.
7. Clean the GP100 GPU with a new cloth dampened with isopropyl alcohol. Take your time to prevent any
alcohol or thermal paste debris from contaminating the GPU module.
8. Use a dry cloth to dry the GP100 GPU and heatsink so they are ready for a new application of thermal paste.
Remove GPU Module
9. Loosen the four center captive screws (1-4) of the GPU module following the order shown below.
10. Start with the first screw and loosen it by giving it two rotations.
11. Proceed to loosen the remaining inner screws (2, 3, and 4) by giving each two rotations at a time until screws 1-4 are completely loosened.
12. Repeat this process with the four outer screws (5, 6, 7, and 8).
13. Hold the GPU module by the narrow sides and gently wiggle it to loosen it from the socket/GPU SXM board. (Do not use excessive force!)
14. Once the GPU module is loosened, lift it straight up to remove it from the tray.
[Figure: Loosen screws 1-4 first, then loosen screws 5-8.]
Install Replacement GPU Module
15. Align the guide pins of the GPU module with the holes over the socket/SXM connectors. The guide pins
ensure installation in the correct orientation.
16. Secure the GPU module by reversing the above steps. Screw the four inside screws first in a diagonal
pattern, then the four outside screws in a diagonal pattern.
Torque each screw to 6-8 lbf-in.
[Figure: Tighten screws 1-4 first, then screws 5-8; torque to 6-8 lbf-in. Guide pins on the GPU module align with the guide-pin holes.]
Install GPU Heatsink
17. Apply thermal grease to the top of the GP100 GPU as shown.
Apply 5 small dabs/drops of thermal grease in an X pattern. Do not spread the grease. (If using a new
heatsink, remove the protective film covering the thermal interface material on the bottom side of the heatsink.
Also remove the blue paper covering the thermal pads; no grease is needed for this situation.)
18. Check the thermal pad on each end of the GPU heatsink. Each thermal pad should be in good condition and
be adhered to a raised strip on each end of the heatsink.
19. Position the heatsink over the GPU so that the heatsink part number faces the right-front side of the GPU tray.
20. Tighten the four captive screws in the heatsink, turning each two rotations at a time. Follow the order shown below (1, 2, 3, 4).
21. Torque each screw to no more than 4.3 lbf-in.
Replacement Procedures
Motherboard Processor and Heatsink Replacement
Prerequisites
● The motherboard tray is removed from the chassis and placed on a stable, ESD-safe work surface.
● Tools:
  ○ Adjustable torque driver, 5–40 in-lb
  ○ Vacuum pen
  ○ Plastic putty knife
● Supplies:
  ○ Isopropyl alcohol, 90% or higher concentration
  ○ Clean, lint-free cloth
  ○ Thermal paste/grease
About this task
This procedure describes how to replace an active heatsink and processor.
Motherboards are not shipped with processors installed. Processors, heatsinks, and memory DIMMs must be
removed from the defective motherboard and installed on the replacement motherboard.
Procedure
Remove Heatsink
The heatsink is attached to the processor socket with captive fasteners. Use an adjustable torque driver to
loosen four screws located on the heatsink corners.
1. Unplug the heatsink/fan power cord from the motherboard.
2. Start with one screw (A) and loosen it by giving it two rotations. Refer to the following figure.
3. Proceed to loosen the remaining screws (B, C, and D) by giving each two rotations. Repeat this process by
loosening each screw two rotations, each time, until all screws are loosened.
4. Lift the heatsink straight up.
Clean Heatsink and Processor
5. Use a plastic putty knife to remove excess thermal interface material (paste/grease) from the heatsink.
6. Clean the heatsink with a clean, lint-free cloth dampened with isopropyl alcohol. Make sure the cloth is not dripping. Do not use a paper towel, as it will leave lint behind.
7. Clean the processor with a new cloth dampened with isopropyl alcohol. Take your time to prevent any alcohol
or thermal paste debris from contaminating the processor socket.
8. Use a dry cloth to dry the processor and heatsink so they are ready for a new application of thermal paste.
Remove Processor
9. Unlatch the processor load plate as shown in the following figure.
a. First, release the lever handle marked with the "Unlock (1)" symbol.
b. Next, release the second lever handle.
10. Open the load plate.
a. Lift the load plate.
b. For a replacement motherboard, open the latch taking care not to touch any of the pins inside the socket.
c. Remove the socket cover from the load plate by pressing it out. Replacement motherboards do not include processors; there should be a socket cover (C). Save and reuse the socket cover if the processor needs to be removed in the future.
11. Remove the processor. Cray Service recommends using a vacuum pen to install and remove processors.
a. Carefully lift the processor out of the socket. DO NOT drop the processor on the socket pins.
b. Immediately place your other hand underneath the processor to protect the socket pins. Place the processor on an ESD-safe work surface.
Install Replacement Processor
12. Install the replacement processor.
a. If necessary, remove the processor from its packaging. Carefully remove the protective cover from the
bottom side of the processor, taking care not to touch any contacts.
b. Orient the processor with the socket so that the processor cutouts match the four orientation posts on the
socket.
c. Note the location of a gold key at the corner of the processor.
d. Carefully place (DO NOT drop) the processor into the socket. Hold the processor down with your finger as you release the vacuum pen.
13. Close the load plate. Carefully lower the load plate down over the processor.
14. Lock the load plate.
a. Push down on the locking lever marked with the "Lock (1)" symbol.
b. Slide the tip of the lever under the notch in the load plate. Make sure the load plate tab engages under
the socket lever when fully closed.
c. Repeat the steps to latch the locking lever on the other side. Latch the levers in the order shown.
Install Heatsink
15. Apply thermal grease to the top of the processor as shown. Apply 5 small dabs/drops of thermal grease in an
X pattern. Do not spread the grease. (If using a new heatsink, remove the protective film covering the thermal
interface material on the bottom side of the heat sink; no grease is needed for this situation.)
16. Attach the heatsink. Position the heatsink fins in the proper orientation.
a. Start with screw A and engage the screw threads by giving it two rotations. (Do not fully tighten.)
b. Proceed to screw B and engage screw threads by giving it two rotations. Continue by engaging screws C
and D.
c. Continue giving each screw, using the same pattern, two rotations at a time until each screw is lightly tightened. Torque each screw to 8 lbf-in.
17. Connect the heatsink/fan power plug to the motherboard.
DIMM Replacement
About this task
This procedure describes how to replace DIMM modules.
Motherboards may not ship with processors or DIMMs installed. Processors, heatsinks, and DIMMs must be
removed from the defective motherboard and installed on the replacement motherboard.
Procedure
Remove DIMM
1. Press down on the DIMM latches to unseat the DIMM from the DIMM socket. The DIMM lifts from the socket.
2. Holding the DIMM by the edges, lift it straight up from the socket and set the DIMM on an ESD-safe work
surface. If the DIMM will not be reinstalled soon, store it in an anti-static package.
Install DIMM
3. Make sure the clips at either end of the DIMM socket are pushed outward to the open position.
4. Holding the DIMM by the edges, remove it from its anti-static package. Position the DIMM above the socket.
Align the notch on the bottom edge of the DIMM with the key in the DIMM socket.
5. Insert the bottom edge of the DIMM into the socket (A). When the DIMM is inserted, push down on the top
edge of the DIMM until the retaining clips snap into place (B). Make sure the clips are firmly in place (C).
Drive Removal
About this task
The drives are mounted in drive carriers to simplify their installation and removal from the chassis. System power may remain on when removing carriers with drives installed. These carriers also help promote proper airflow for the drive bays. For this reason, even empty carriers without drives installed must remain in the chassis.
Procedure
1. Press the release button on the drive carrier. This extends the drive carrier handle.
2. Use the handle to pull the drive and its carrier out of the chassis.
Drive Installation
Procedure
1. Remove the dummy drive from the carrier by removing the screws.
2. Insert a drive into the carrier with the PCB side facing down and the connector end toward the rear of the
carrier. Align the drive in the carrier so that the screw holes line up. There are holes in the carrier marked
“SATA” to aid in correct installation.
3. Secure the drive to the carrier with the screws.
4. Insert the drive carrier into its bay with the carrier release handle on the top and the release button on the
bottom. When the carrier reaches the rear of the bay, the release handle will retract.
5. Push the handle in until it clicks into its locked position.
Fan and Power Supply Replacement
About this task
The chassis contains eight 9-cm exhaust fans that provide cooling for the system. Four of these fans are
combined with the four hot-plug power supplies. There is no need to power down the system when switching fans
or power supplies.
In the event of a power supply failure, the remaining power modules automatically take over. The failed power module can be replaced without powering down the system. Replace modules with the same model.
An amber light on the power supply illuminates when the power is switched off. A green light indicates that the power supply is operating.
Procedure
Changing a System Fan or Power Supply
1. Determine which fan/power supply has failed using IPMI or observation.
2. For a power supply, disconnect the power cord.
3. Lift the locking lever and pull the fan/power supply module from the housing/bay.
4. Push the new fan/power supply module into the bay until it clicks into place. Press the locking lever down to
ensure the module is seated.
5. For a power supply, plug the AC power cord back into the power supply module.