With Mosaic - GPU Technology Conference
S0341 - See the Big Picture
Scalable Visualization Solutions for System Integrators
Doug Traill - [email protected] /[email protected]
SVS Solutions
MOSAIC
GSync
Three (3) things that I want you to learn
• MOSAIC – Application Scalability
• Synchronization – Focus on the image and not the artifacts
• Visual Acuity – ultra high resolution “retina” displays.
Quadro Features for System Integrators
MOSAIC Technologies
 Without Mosaic: 4 Independent Displays
 With Mosaic: Single Unified Desktop & Taskbar
Mosaic Features
Scale with Quadro and NVS Solutions
Key Features
• Easy Configuration
• Unified Desktop (up to 8 display devices*)
• Application Spanning
• Taskbar Spanning
• Bezel Correction
• Windows 7 + Linux Support
* All displays require matching timings and resolution
Premium Mosaic Features
Available with high-end Quadro solutions
Additional Premium Features
Single or Dual Quadro Plex
• Seamless Display
• Projector Overlap
• Stereo Support
• Quadro G-Sync Support
• Linux and Windows Vista, XP and 7 Support
• NEW API Support for Warp + Intensity Correction
Single or SLI:
Quadro 5000, 6000
NV-WARP – Warp + Intensity API
Wednesday Room A1 – 10.00am Warping + Blending for Seamless Displays
Image courtesy of Joachim Tesch
- Max Planck Institute for Biological Cybernetics
SDK – Available to
Registered Developers
3rd party applications
 Full Auto-calibration system
 Premium MOSAIC support
 Win 7 only
Sample SDK
Three function calls
NVAPI
Win7 only
Certified Platforms for Dual QUADRO 5000/6000
Premium MOSAIC
• HP Z800/Z820 – Dual Quadro 5000/6000
• Dell T7500 – Dual Quadro 5000/6000
• Lenovo D20/C20 – Dual Quadro 5000/6000
• Fujitsu R670/R570 – Dual Quadro 5000/6000
http://www.nvidia.com/object/quadro_sli_compatible_systems.html
Certified Quadro Plex Platforms
 Most workstation/server class platforms support single
Quadro Plex
 Most can support Dual Quadro Plex
 Test suite for system builders to certify Quadro Plex.
http://www.nvidia.com/page/quadroplex_certified_platforms.html
Differences between Premium Mosaic + Mosaic
• Frame Synchronization
— Vertical Sync – to a common timing. Without a physical connection between
cards there is no method for providing a common sync.
• The effect is tearing.
— Stereo
• Without frame sync there is no method for syncing the left/right eye between GPUs.
— Overlap
• Without frame sync, tearing would be most noticeable in a blend region.
• We disable this feature so that tearing is not shown.
No Frame Sync
[Timing diagram: Display 0 and Display 1 are on different GPUs and start their refresh at different times (t0 and t0 + t1)]
• Vertical Sync is the pulse that indicates the start of the display refresh.
• To avoid tearing on a single screen the application swap buffers are
synced to vertical sync.
• Although all four displays may have the same refresh rate – vertical sync
start between 2 GPUs will be different.
• This can result in tearing between displays.
Frame Sync – on SLI Mosaic
[Timing diagram: Display 0 and Display 1 are on different GPUs and both start their refresh at t0]
• Framelock provides a common sync signal between graphics cards to ensure that the
vertical sync pulse starts at a common time.
• This is commonly referred to as Frame Synchronization.
• On SLI Mosaic in a workstation, the framelock signal is provided across the SLI bridge.
• Between dual Quadro Plexes, the framelock signal is provided over the CAT5 cable.
Let the OS manage multiple displays
[Diagram: App, two GPUs, Displays]
(1) Rendering occurs on one GPU
(2) Pixels are copied across the PCIe bus to the other GPU for display
Let the Application manage multiple displays
[Diagram: App, two GPUs, Displays]
(1) Rendering occurs on one GPU
(2) Pixels are copied across the PCIe bus to the other GPU for display
Application with GPU Affinity
Wednesday 9.00am Programming Multi-GPUs for Scalable Rendering
[Diagram: App, GPU Affinity, GPUs, GSyncII card, Displays]
Application needs to be multi-threaded (4 draw threads)
GSyncII card needed for framelock
Needs to be programmed using GPU Affinity (NVIDIA extensions) for max performance
Application should use NV swap groups to sync swap buffers between GPUs
MOSAIC – hides the complexity from the
application
In MOSAIC mode driver works in Broadcast mode to GPUs
NVIDIA Control Panel
Order in which commands
are applied can matter
(1) Manage 3D Settings
Profile
Stereo
Vsync etc
(2) Set Resolution
(3) Set MOSAIC and/or
Synchronization
Configure Mosaic
Understanding Topologies
• MOSAIC uses grids to describe the topology
• The grid is numbered starting with the TOP ROW, left to right
[Diagram: two example grids, labeled by rows and columns, each with displays numbered 1 to 4]
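The numbering rule above can be sketched in a few lines of Python. This is only an illustration: the helper name grid_position is hypothetical, and it assumes that numbering continues row by row after the top row.

```python
def grid_position(display_number, cols):
    """Return the zero-based (row, col) of a 1-based display number.

    Assumes displays are numbered along the top row, left to right,
    and then continue on the next row (row-major order).
    """
    if display_number < 1 or cols < 1:
        raise ValueError("display_number and cols must be positive")
    return divmod(display_number - 1, cols)


# 2x2 grid: display 3 is the first display of the second row.
print(grid_position(3, cols=2))  # (1, 0)
# 1x4 grid: display 3 is the third display of the only row.
print(grid_position(3, cols=4))  # (0, 2)
```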
Port numbers – QuadroPlex 7000
[Diagram: rear panels showing GPU 0 and GPU 1, each with ports 1 and 0; the amber LED appears at POST]
Amber LED indicates the primary GPU (0)
Right-hand port is the primary port (0)
We can describe each port by its (GPU,Port) number
Relating Ports to Grid
[Diagram: 2x2 grid with displays numbered 1 to 4, each labeled with its (GPU,Port) output]
configureMosaic.exe set rows=2 cols=2
configureMosaic.exe set rows=2 cols=2 out=0,0 out=0,1 out=1,0 out=1,1
[Diagrams: displays numbered 1 to 4, each labeled with its (GPU,Port) output, for the two grids below]
1x4 Grid
configureMosaic.exe set rows=1 cols=4
2x2 Grid
configureMosaic.exe set rows=2 cols=2
[Diagrams: numbered displays, each labeled with its (GPU,Port) output, for the three grids below]
1x2 Grid
configureMosaic.exe set rows=1 cols=2
2x1 Grid
configureMosaic.exe set rows=2 cols=1
1x3 Grid
configureMosaic.exe set rows=1 cols=3
Passive Stereo
[Diagrams: left-eye and right-eye images on separate outputs, each labeled with its (GPU,Port) output; 3D Settings panel]
1x2 Grid
configureMosaic.exe set rows=1 cols=2 passivestereo
2x1 Grid
configureMosaic.exe set rows=2 cols=1 passivestereo
Port layout for SLI workstation
[Diagram: master GPU in PCI Slot 2 with ports 0,0 and 0,1; a blank slot; second GPU in PCI Slot 4 with ports 1,0 and 1,1]
Only two connections per GPU!
Layout for HP Z800 – other workstations may vary
Port layout for SLI workstation
Verifying outputs
only 0,0 on
configuremosaic set rows=1, cols=1 out=0,0
only 0,1 on
configuremosaic set rows=1, cols=1 out=0,1
only 1,0 on
configuremosaic set rows=1, cols=1 out=1,0
only 1,1 on
configuremosaic set rows=1, cols=1 out=1,1
Port layout for SLI workstation
DVI port is always primary on the card – if used!
[Diagram: ports 0,0 and 0,1 on the first GPU; ports 1,0 and 1,1 on the second GPU]
Dual Quadro Plex
[Diagram: primary and secondary Quadro Plex units with DHIC]
GPU 3: ports 3,1 and 3,0
GPU 2: ports 2,1 and 2,0
GPU 1: ports 1,1 and 1,0
GPU 0: ports 0,1 and 0,0
• DHIC required for SLI Mosaic > 4 displays
• Amber LED – indicates master
• Framelock
• RJ45 between GsyncII cards
Nvidia Control Panel 2x4 Grid
[Diagram: 2x4 grid with displays numbered 1 to 8, each labeled with its (GPU,Port) output]
configureMosaic set rows=2 cols=4 out=0,0 out=0,1 out=2,0 out=2,1 out=1,0 out=1,1 out=3,0 out=3,1
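A command line like the one above can be assembled from a list of (GPU,Port) pairs. The Python sketch below is only an illustration: the helper name is hypothetical, it emits only the rows=, cols= and out= arguments shown on these slides, and it assumes the out= values are listed in grid order (top row first, left to right).

```python
def configure_mosaic_command(rows, cols, outputs):
    """Build a configureMosaic command line from (gpu, port) pairs.

    `outputs` is assumed to be in grid order: display 1 first,
    then left to right, top row before bottom row.
    """
    if len(outputs) != rows * cols:
        raise ValueError("need exactly rows * cols outputs")
    out_args = " ".join(f"out={gpu},{port}" for gpu, port in outputs)
    return f"configureMosaic set rows={rows} cols={cols} {out_args}"


# The 2x4 layout from the slide above.
outputs = [(0, 0), (0, 1), (2, 0), (2, 1), (1, 0), (1, 1), (3, 0), (3, 1)]
print(configure_mosaic_command(2, 4, outputs))
```

Running the sketch prints the same command line as the slide, which makes it easy to check a wiring plan before typing it in.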
2x4 Grid
[Diagram: 2x4 grid with displays numbered 1 to 8, each labeled with its (GPU,Port) output]
configureMosaic set rows=2 cols=4
2 Channel Overlap
180 pixel overlap
configureMosaic.exe set rows=1 cols=2 overlap=180,0
Blending 4K Projectors
Column overlaps, left to right: 0 pixel, 180 pixel, 0 pixel
configureMosaic.exe set rows=2 cols=4 overlapcol=0,180,0
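The effect of overlap on the desktop width can be sketched as follows. This is an illustration under stated assumptions: it assumes each overlap value removes that many pixels from the combined width because the overlapped pixels are shown by both channels, and the channel widths used (1920 and 2048) are hypothetical since the slides do not state them.

```python
def blended_width(channel_width, overlaps):
    """Total desktop width for channels placed side by side.

    `overlaps` holds the pixel overlap at each seam, left to right,
    so a row of N channels has N - 1 entries.
    """
    channels = len(overlaps) + 1
    return channels * channel_width - sum(overlaps)


# Two channels with a 180 pixel overlap (hypothetical 1920-wide channels).
print(blended_width(1920, [180]))        # 3660
# Four columns with overlaps 0, 180, 0 (hypothetical 2048-wide columns).
print(blended_width(2048, [0, 180, 0]))  # 8012
```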
Portrait Mode – Win 7 only
[Diagram: 1x4 grid of rotated displays numbered 1 to 4, labeled with outputs 0,0, 0,1, 2,0 and 2,1]
configureMosaic set rows=1 cols=4 rotate=90
Valid rotate values: 90, 180, 270
MOSAIC + 1 – setting up multiple GRIDS
[Diagram: 2x2 grid with displays numbered 1 to 4, an output labeled 0,1, and a separate FX1800 display]
configureMosaic set rows=2 cols=2 nextgrid rows=1 cols=1
Note: only 1 grid can be across multiple GPUs
The first grid set is the primary
configureMosaic set rows=2 cols=2 rotate=90 nextgrid rows=1 cols=1 nextgrid rows=1 cols=1 rotate=90
Win 7 – Driver Profiles
• Set default 3D settings for a profile
• Sets driver optimization
• Generic + ISV types
— 3D App – Visual Simulation
— 3D App – Video Editing
— Autodesk MotionBuilder
— Dassault Systèmes CATIA
— etc.
Common Profiles
3D App – Game Development
• Turns the card into a GeForce card
• Good for DirectX games
3D App – Modeling AFR
• CAD/3D modeling type applications
• Support for SLI Alternate Frame Rendering
3D App – Video Editing
• Optimization for video playback & editing
• Eliminates video tearing
3D App – Visual Simulation
• Optimizes the OpenGL pipeline for Viz Sim applications
• Good for applications wanting fixed fps, e.g. 60 fps
• No quad-buffered stereo support
Workstation Dynamic Streaming
• Applications using GSync
• Applications wanting fixed fps
• Quad-buffered stereo support
Performance Hit for Multiple Displays
[Chart: Viewperf 10.0 relative performance (0 to 1) with 1 screen, 4 screens and 8 screens for 3dsmax-04, catia-02, ensight-03, maya-02, proe-04, sw-01, tcvis-01 and ugnx-01]
SLI Mosaic Performance Advantage
[Chart: Viewperf 10.0 relative performance (0 to 1.2) with 1 screen, 4 screens with Mosaic and 8 screens with Mosaic for 3dsmax-04, catia-02, ensight-03, maya-02, proe-04, sw-01, tcvis-01 and ugnx-01]
MOSAIC Performance Enhancements
Multi-GPUs (does not work on a single GPU)
Pixel fill limited apps
MOSAIC uses a lot of fill
Pixel fill = screen size: the larger the screen, the more fill
If you shrink the window and performance improves, the app is fill limited
MOSAIC Performance Enhancements
Scissor clip function
Best for full screen apps
If you drag windows around you
will see distortion.
To enable
enable_Mosaic_Clip_To_Subdev.exe
To disable
disable_Mosaic_Clip_To_Subdev.exe
Improves fill performance on MOSAIC – Performance Gain will vary by Application
email: [email protected]
R295 Refresh 1
Video Display Controllers
Features
• Dual link DVI or DP input
• 2 or more DVI outputs
• 330 MHz video bandwidth
• Each output up to 165 MHz
• 1:1 pixel mapping of input to output
Examples
• CYVIZ XPO.3
• DataPath X4
• Pixell VP-4xx
• Planar Quad Controller
• Black Diamond Video – DVI splitter
• Matrox TripleHead2Go
• Etc.
16 BARCO Projection cubes
4x4 BARCO Projection cubes
Dual Quadro Plex 7000
Linux running Premium MOSAIC
Each output runs two cubes – [email protected]
CUBE splits the signal across two displays at 1920x1080
For Stereo 3D the input is frame doubled to 120Hz
[Diagram: outputs numbered 1 to 8]
Image courtesy of AVI-SPL
4x4 [email protected]
7680
[email protected]
6480
16 [email protected] Displays
configureMosaic set rows=2 cols=4 res=1920,2160,60
Using Linux
#Configure MOSAIC layout
nvidia-xconfig --sli=Mosaic --metamodes="GPU-0.DFP-0: 1920x2160+0+0, GPU-0.DFP-1: 1920x2160+1920+0, \
GPU-1.DFP-0: 1920x2160+3840+0, GPU-1.DFP-1: 1920x2160+5760+0, \
GPU-2.DFP-0: 1920x2160+0+2160, GPU-2.DFP-1: 1920x2160+1920+2160, \
GPU-3.DFP-0: 1920x2160+3840+2160, GPU-3.DFP-1: 1920x2160+5760+2160"
#Turn off composite desktop - this affects stereo + gsync.
nvidia-xconfig --no-composite
#Set stereo mode. On board DIN =3;
nvidia-xconfig --stereo=3
#Turn off twinview xinerama info - this creates a large desktop.
nvidia-xconfig --no-twinview-xinerama-info
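The metamodes string above follows a regular pattern: each output gets the per-output resolution plus an x/y offset given by its column and row. The Python sketch below generates that string; the helper name is hypothetical, and it assumes the outputs are listed in grid order (top row first, left to right) with identical resolutions.

```python
def build_metamodes(rows, cols, width, height, outputs):
    """Return a MetaModes string for a rows x cols grid of equal displays.

    `outputs` are names such as "GPU-0.DFP-0", assumed to be in grid
    order: top row first, left to right.
    """
    if len(outputs) != rows * cols:
        raise ValueError("need exactly rows * cols outputs")
    parts = []
    for index, name in enumerate(outputs):
        row, col = divmod(index, cols)
        parts.append(f"{name}: {width}x{height}+{col * width}+{row * height}")
    return ", ".join(parts)


# The 2x4 layout from the slide: four GPUs with two outputs each.
outputs = [f"GPU-{gpu}.DFP-{port}" for gpu in range(4) for port in range(2)]
print(build_metamodes(2, 4, 1920, 2160, outputs))
```

The printed string can then be passed to nvidia-xconfig as the value of --metamodes, as in the command above.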
USF – Tampa
16 thin bezel – LCD panels
720p resolution
Passive stereo – horizontal line interlace.
4x4 array
Dual Quadro Plex 7000
One output per card
Video processor splits across 4 cubes
1:1 pixel mapping
[Diagram: outputs numbered 1 to 4]
Image courtesy of University of South Florida - Tampa
4x8 [email protected]
4x 1366x768
4x 1366x768
32 [email protected] Displays
configureMosaic set rows=1 cols=8 res=1366,3072,60
NOTE: follow the display ordering diagrams from earlier,
this image is wired for visual clarity
Total Resolution – 10,944 x 3072
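The total resolution of a grid without overlap is the per-output resolution multiplied by the number of columns and rows. A minimal Python sketch (the helper name is hypothetical) also shows that eight columns of 1366 pixels give 10,928, while the 10,944 quoted above corresponds to columns 1368 pixels wide; the slides do not say which width the controllers actually use.

```python
def mosaic_resolution(rows, cols, width, height):
    """Total (width, height) of a grid of identical outputs, no overlap."""
    return cols * width, rows * height


print(mosaic_resolution(2, 4, 1920, 2160))  # (7680, 4320)
print(mosaic_resolution(1, 8, 1366, 3072))  # (10928, 3072)
print(mosaic_resolution(1, 8, 1368, 3072))  # (10944, 3072)
```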
Create the Custom Resolution
If the controller does not provide the resolution, create one
Make sure to select a timing other than Automatic for the Standard
Make sure the Pixel clock on the lower right is <= 330MHz
Set the same resolution on all attached controllers
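The pixel clock limit can be checked before creating the custom resolution. The sketch below uses the usual definition of pixel clock (total pixels per line x total lines x refresh rate, blanking included). The function names and the total timings in the example are hypothetical; the real totals depend on the timing standard selected in the control panel.

```python
MAX_PIXEL_CLOCK_MHZ = 330.0  # limit quoted on the slide


def pixel_clock_mhz(h_total, v_total, refresh_hz):
    """Pixel clock in MHz from total (active + blanking) timings."""
    return h_total * v_total * refresh_hz / 1e6


def fits_controller(h_total, v_total, refresh_hz, limit_mhz=MAX_PIXEL_CLOCK_MHZ):
    """True if the mode's pixel clock is within the limit."""
    return pixel_clock_mhz(h_total, v_total, refresh_hz) <= limit_mhz


# Active pixels only for 1920x2160 at 60 Hz: a lower bound on the clock.
print(pixel_clock_mhz(1920, 2160, 60))
# Hypothetical totals with blanking for the same mode.
print(fits_controller(2080, 2222, 60))
```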
NVIDIA Scalable Visualization Solutions
Display Channels
• > 8 DVI: Beyond 8 DVI Dual Link requires clustered PCs with Quadro GSync to synchronize displays and multi-GPU aware software.
• 4-8 DVI: Quadro Plex Scalable Visualization Solutions (Single Host) or Quadro SLI Workstation (Dual Quadro 5000/6000). Runs any standard application.
• 2-4 DVI, 2-4 DP or 1-2 DP: Single Workstation (with Add-in Card). Runs any standard application.
Largest CAVE in the World
C6 at Iowa State
4 x 4K projectors per wall
6 sides
96 NVIDIA GPUs in a cluster
driving the display
KAUST
Similar in design to C6
Uses Quadro Plexes to reduce node count.
GSync II – Hardware + Software Sync
• Hardware
— RJ45 – Framelock for synchronization of multiple displays to a common internal sync
— BNC/Genlock – Framelock for synchronization of multiple displays to a common external house sync
• Software
— Requires the application to be written with extensions
— Swap Group and Swap Barrier are OpenGL/DirectX extensions that provide enhanced synchronization of the graphics swap buffer.
Vertical Sync
[Timing diagram: Display 1, Display 2 and Display 3 start their refresh at different offsets (t0, t0 + t1, t0 + t2)]
• Vertical Sync is the pulse that indicates the start of the display refresh.
• To avoid tearing on a single screen the application swap buffers are
synced to vertical sync.
• Although all three displays may have the same refresh rate – vertical sync
start may be different.
• This can result in tearing between displays.
Framelock/Genlock
[Timing diagram: Display 1, Display 2 and Display 3 all start their refresh at t0]
• Framelock/Genlock provides a common sync signal between graphics cards to
ensure that the vertical sync pulse starts at a common time.
• This is commonly referred to as Frame Synchronization.
• Framelock – synchronization is generated from a master node. All other nodes
are synced to it.
• Genlock – synchronization is from an external sync generator (house sync). Each
node attached to the genlock signal is synced from that signal.
• Framelock & Genlock can be mixed in the cluster, with the master node being
synchronized from the genlock pulse.
Swapbuffers
• Mono OpenGL applications have two buffers: Back and Front
The application will render into one buffer while the pixels are read to the screen
from the other buffer. Once the render process is complete, the buffers swap, i.e.
• Front – render
• Back – read to screen
• swap
• Back – render
• Front – read to screen.
Swapbuffers
 Swap between the two buffers will occur:
— On the first vertical sync after the Render process completes
 For example at 60Hz refresh rate we have 16.67 ms to
complete the render of a frame
— If render time = 10ms frame rate will be 60 fps (we swap on
vertical sync)
— If render time = 17 ms frame rate will be 30 fps (we swap on the
next vertical sync).
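The arithmetic above can be written as a small Python sketch. It assumes a double-buffered application that swaps only on vertical sync, so a frame occupies a whole number of refresh intervals; the function name is hypothetical.

```python
import math


def vsync_frame_rate(render_ms, refresh_hz=60):
    """Frame rate when each swap waits for the next vertical sync."""
    # Number of refresh intervals one frame occupies (at least one).
    intervals = max(math.ceil(render_ms * refresh_hz / 1000), 1)
    return refresh_hz / intervals


print(vsync_frame_rate(10))  # 60.0
print(vsync_frame_rate(17))  # 30.0
```

A render time just over one refresh interval halves the frame rate, which is why small performance changes can cause large visible drops.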
Swapbuffers in a cluster
[Diagram: one scene split across Node 1, Node 2, Node 3 and Node 4]
Each node is now rendering a scene of different complexity, i.e. from lowest to highest
we get:
1. node 3 ~ 16ms = 60fps
2. node 4 ~ 36ms = 30fps
3. node 2 ~ 53ms = 15fps
4. node 1 ~ 99ms = 10fps
• With each node running at a different rate the user would perceive tearing on the screen.
• We need a mechanism to ensure that each node will swap at the same time.
Swap Group and Swap Barrier
• NVIDIA extensions to OpenGL/DirectX (via NVAPI)
• Swap Group – provides synchronization of multiple GPUs in a single host
• Swap Barrier – provides synchronization of GPUs across multiple nodes.
• Uses the RJ45 (framelock) connection on GSync, so it is faster than sync over a
network
With Swap Barrier each node will wait until all nodes have completed their render
1. node 3 ~ 16ms = 10fps
2. node 4 ~ 36ms = 10fps
3. node 2 ~ 53ms = 10fps
4. node 1 ~ 99ms = 10fps
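The waiting behaviour can be simulated on a single machine with threads. The sketch below is only an analogy: it uses Python's threading.Barrier in place of the hardware swap barrier and time.sleep in place of rendering, and it does not call any NVIDIA API. The render times are the ones from the slide.

```python
import threading
import time


def simulate_swap_barrier(render_ms_by_node, frames=3):
    """Return {node: [swap timestamps]}; nodes swap only when all are ready."""
    barrier = threading.Barrier(len(render_ms_by_node))
    swaps = {node: [] for node in render_ms_by_node}

    def node_loop(node, render_ms):
        for _ in range(frames):
            time.sleep(render_ms / 1000)  # stand-in for rendering one frame
            barrier.wait()                # block until every node is ready
            swaps[node].append(time.monotonic())

    threads = [threading.Thread(target=node_loop, args=item)
               for item in render_ms_by_node.items()]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return swaps


render_ms = {"node 1": 99, "node 2": 53, "node 3": 16, "node 4": 36}
swaps = simulate_swap_barrier(render_ms)
for node, times in swaps.items():
    # Every node ends up paced by the slowest one.
    interval_ms = (times[-1] - times[0]) / (len(times) - 1) * 1000
    print(f"{node}: about {interval_ms:.0f} ms between swaps")
```

In the simulation every node reports roughly the same interval between swaps, set by the slowest node, which mirrors the 10 fps shown for all four nodes above.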
32 Node cluster
[Diagram: each node runs the application with Swap Barrier; framelock (RJ45) connects the nodes]
GSyncII Signaling
 CAT 5 – not ethernet
— Framelock (sync pulse – will be same as House Sync)
— Swap Ready
 Physical connection to GPU for swap group.
 High when blocked, low when ready to swap.
— Stereo Sync
 VESA stereo port
 Not used for passive stereo
 Make sure stereo is enabled in Manage 3D settings on timing server + client
prior to enabling synchronization.
Driver Profiles for GSync
 Most Common (can be exceptions)
— Workstation Dynamic streaming
 Stereo
 Swap Groups
 Constant frame rate
— 3D App Visual Simulation
 Constant frame rate
3D Vision Pro with Projection systems
NVIDIA 3D Vision Pro
3D Vision Pro Glasses
120 Hz Active Shutter
2.4 GHz RF control
24 hours battery life
Support for 3D Vision Ready LCDs, projectors, CRTs and DLP TVs
3D Vision Pro Hub
Up to 100 ft (30m) range
Provides UI and NVAPI information
Supports Quadro boards with and without a stereo DIN, including mobile workstations
Supports the same GeForce boards and features as 3D Vision
Wide Pro application support on Quadro
Installation - Windows
• Drivers and Guide are at www.nvidia.com/3dvpro
• Drivers need to be installed before the hub is connected
• Need
— 266.35 or newer display driver
— 266.21 or newer USB driver
— Supported display with refresh rate set correctly
Consumer 3D Vision Guides
More Complex 3D Vision Pro installs
 Projectors that require active stereo sync
 Double or Triple flash Projectors
— 60Hz input to 120Hz
— 48Hz input to 144Hz
3D Vision Pro Glasses Syncing to different timings
• 3DV Pro Glasses adjust to the display or projector they are working with
— Dark interval and timings
• When using the glasses you’ll see the lens “darkness” change with different devices
• Timings selected from display EDID
— If EDID is known, uses programmed values
— If not recognized, uses CRT (or DLP if connected to a DLP TV)
[Diagram: shutter timings for CRT + 3 chip DLP projectors, single chip DLP, and LCD displays]
Projectors that require active stereo sync
 Most Pro projectors require VESA stereo sync e.g.
— BARCO Galaxy
— Christie Mirage
— DPi Titan
— Projection Design F35
 Sync is used by the projector to identify left or right eye.
 Sync is looped through the projector to the hub (emitter).
— Projector has a one frame buffer.
— Projectors will delay the sync signal by one frame – reversing
left/right eye.
Projectors that require active stereo sync
• Problem
— Sync from the projector is typically BNC
— Current hub requires 5V DC on VESA input.
• Solution
— System integrators need to make a special cable to provide 5V
[Diagram: cable with a 3 pin VESA connector to a BNC connector; the BNC end is connected to the projector]
Standard Pin outs for 3D Vision Pro Hub
Pin 1: Ground
Pin 2: +5V
Pin 3: Stereo Sync signal (High = Left Eye image being displayed, Low = Right Eye)
Custom Cable BNC to mini-jack pinout
[Diagram: cable from the projector to the 3D Vision Pro Hub, with an external +5V DC feed]
Signal Name   Cable         BNC      3D Vision Pro mini-jack pin
5 Volts       ext source    N/A      2
GROUND        COAX braid    Shell    1
Stereo L/R    COAX center   Center   3
Double or triple flash projectors
• Take 60Hz input and double to 120Hz
• Take 48Hz input and triple to 144Hz
• Reduces overall infrastructure cost – single-link DLP.
• Problem
— Stereo sync is generated by the projector at 120 Hz
— Hub is set to 60 Hz – this is what the workstation generates
• Solution
— Command line tool that sets the hub to 120 Hz – runs on a proxy PC.
Management of Glasses
Management is separate from on-screen rendering
Multiple stereo sources
Single PC manages pairing for all devices
[Diagram: Render Workstation, Proxy System, [email protected], 120Hz VESA Sync]
Double or triple flash projectors
Command line tool sets the hub to the correct refresh rate
[Diagram: 3D Vision Pro hub connected over USB to the Proxy PC]
Command line for setting 3DVision Pro
• nv3dvp.exe
nv3dvp.exe activateproxy display-refresh-rate
display-refresh-rate is the refresh rate of the stereo display
Examples:
nv3dvp.exe activateproxy 120 (120Hz stereo display)
nv3dvp.exe activateproxy 96 (96Hz stereo display)
nv3dvp.exe activateproxy 144 (144Hz stereo display)
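The refresh rate passed to the tool is the rate at which the projector flashes, not the rate the workstation sends. The Python sketch below works that out and formats the command line. The helper names are hypothetical, the sketch only builds the command string and does not run the tool, and reading the 96Hz example as a 48Hz input that is double flashed is an assumption.

```python
def stereo_refresh_hz(input_hz, flash_factor):
    """Refresh rate of the stereo display after double or triple flash."""
    return input_hz * flash_factor


def proxy_command(display_refresh_hz):
    """Command line to run on the proxy PC for the given display rate."""
    return f"nv3dvp.exe activateproxy {display_refresh_hz}"


# (input rate, flash factor): 60Hz doubled, 48Hz doubled, 48Hz tripled.
for input_hz, factor in [(60, 2), (48, 2), (48, 3)]:
    print(proxy_command(stereo_refresh_hz(input_hz, factor)))
```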
Email: [email protected]
Summary
• Synchronization
— Focus on the image and not the artifacts
• Reliability
— 24/7 Operation
— Fortune 500 companies put their trust in Quadro
• Visual Acuity
— Ultra high resolution ‘retina’ displays
— Reality based Design
• Application Scalability
— The applications I use on my desktop just work
Questions & a Reminder
To learn more, or if you have more questions, contact us at [email protected]