A Deep Dive into
Image Processing
for i.MX 6 Application Processors
FTF-CON-F0119
Oliver Brown | Senior Software Engineer
APR.10.2014
TM
External Use
Agenda
• Introduction
• IPUv3
– system overview
•
IPUv3
– fundamentals
• IPU & the iMX Linux BSP
• Use case examples / tips
Introduction
• The IPU (Image Processing Unit) is present on most i.MX products
• IPUv1 was first introduced on the i.MX31 and upgraded on the i.MX35
• IPUv3 is a family of IPs present on the i.MX37, i.MX51, i.MX53 and i.MX6Q/D/DL
Introduction
• This presentation describes IPUv3 on i.MX5 and i.MX6.
• The slides are based on i.MX6
• The IPUv3 architecture is common to all products
• The products differ in
− Processing speed (from 133 MHz to 264 MHz)
− The modules included (CSI, VDI, ISP etc.)
− The connectivity options and interfaces (HDMI, LVDS, MIPI etc.)
Introduction
• The following slide set is based on past versions of IPUv3
• The slides used for the i.MX51’s NPI training can be found on:
− http://compass.freescale.net/doc/195331621/0201_IPUv3EX_In_MX51.ppt
• The slides used for the i.MX53’s NPI training can be found on:
− http://compass.freescale.net/livelink/livelink/218704608/iMX53_IPUv3M.ppt?func=doc.Fetch&nodeid=218704608
• The slides used for the i.MX6’s NPI training can be found on:
− http://compass.freescale.net/livelink/livelink/224514161/Day2-1-iMX6_Dual-Quad_NPI_Training_-_IPU.pptx?func=doc.Fetch&nodeid=224514161
IPUv3 resources
Freescale Internal Links
• IPU on Compass
− http://compass.freescale.net/go/ipu
− http://compass.freescale.net/go/ipudes
• IPUv3 code examples for MX6/Q
− http://compass.freescale.net/livelink/livelink?func=ll&objId=222977460&objAction=browse&viewType=1
• IPUv3 code examples
− http://compass.freescale.net/go/189478969
• IPUv3 users mail list
External Links
− https://community.freescale.com/community/imx
Video/Graphics System in i.MX6 D/Q
Video & Graphics System in i.MX6 D/Q
Full HW Support –> Multiple Advantages
− The CPU does not have to touch pixels –> available to run application
− Optimized data path –> reduced DDR load –> complex use cases with only 32-bit DDR memories
− Lower power consumption (because of both aspects above)
Video Sources Displays
Interface Bridges
LVDS, HDMI, MIPI
IPUs
(Image Processing Unit)
Connectivity to relevant devices
Image processing: conversions, enhancement…
Synchronization and control
DCICs
Display Content
Integrity Check
VPU
(Video Processing Unit)
Video encoding and decoding
GPUs
(Graphics Processing Units)
Graphics generation
Video/Graphics Subsystem in i.MX6 D/Q
Data
Control
Video Sources i.MX6
Dual/Quad
ARM
CPU
GPUs
Displays
DCICs
IPUs
VDOA
VPU
IRAM
External
Memories
Multimedia Processing Chain
Image Sensor
Image Signal
Processing
Image
Conversions
Compression
Combining
with Audio
Audio Compression
Camera Preview
Memory
Communication
Network
Display
Display
Enhancement
Video/Graphics
Combining
Image
Conversions
De-blocking
De-ringing
De-compression
Separation from Audio
Audio De-compression
Graphics
Generation
• Image Signal Processing
− Bayer -> YUV conversion
− Image quality enhancement
− Camera corrections
• Image Conversions
− De-interlacing
− Resizing (resolution adjustment)
− Rotation & Inversion
− Color Space Conversion
− Pixel Format Conversion (packing…)
• Display Enhancement
− Color adjustments and gamut mapping
− Gamma correction and contrast stretching
− Compensation for low-light conditions and backlight reduction
Multimedia Processing Chain
– Implementation
Image Sensor
Display
Camera
Image Signal
Processing
Image
Processing
Display
Enhancement
•
• Comprehensive HW support:
Video/graphics fully handled by
IPU, VPU and GPU.
-> The CPU does not have to touch pixels
Unit (IPU)
Video/Graphics
Combining
Camera Preview
Image
Conversions
Compression
Video
Processing
Unit (VPU)
Image
Conversions
De-blocking
De-ringing
De-compression
Graphics
Generation
Graphics
Processing
Units (GPUs)
And ARM
Combining
with Audio
Audio Compression
Memory
Communication
Network
ARM
Separation from Audio
Audio De-compression
HW Accelerated
Display Support
i.MX37, i.MX51, i.MX53, i.MX6 D/Q
– Display Support
• How to calculate the display resolution?
− FW = Frame Width
− FH = Frame Height
− FPS = Frame rate (fps)
− BI = Blanking interval: provided in the display’s datasheet, up to 35% (1.35). Use min values.
• The pixel clock [MHz] is calculated according to:
F = FW x FH x FPS x BI
• A few things to consider:
− Data format (pixels per clock?)
− Display’s clock source (DI#_CLK_EXT bit)
− The load on the display controller (DC)
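The formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the default blanking factor are assumptions, one pixel per clock is assumed, and the product is divided by 10^6 to express the result in MHz.

```python
def display_pixel_clock_mhz(frame_width, frame_height, fps, blanking=1.35):
    """Pixel clock in MHz from F = FW x FH x FPS x BI.

    blanking is the BI factor (1.0 = no blanking, 1.35 = 35% overhead).
    Assumes one pixel is transferred per clock cycle.
    """
    return frame_width * frame_height * fps * blanking / 1e6


if __name__ == "__main__":
    # 1080p at 60 fps with a 30% blanking overhead
    print(round(display_pixel_clock_mhz(1920, 1080, 60, 1.3), 2))
```

The result can then be compared against the pixel clock limit of the target device (for example 264 MHz per IPU on i.MX6 D/Q).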
i.MX37, i.MX51, i.MX53, i.MX6 D/Q
– Display Support
Feature
Throughput # of outputs
Pixel clock rate
Resolution
(@ 60 Hz)
Interfaces Parallel
LVDS
HDMI
MIPI/DSI
Analog
Display Content Authentication (CRC)
Processing On-the-fly combining
(for high resolution displays)
Off-line combining
Display enhancement
Backlight power optimization i.MX51 (IPUv3EX) i.MX37 (IPUv3D) i.MX53 (IPUv3M) i.MX6 D/Q
(2 x IPUv3H)
Up to 133 MHz
WXGA+
(1600x900)
720p
(1280x720)
+ SVGA
(800x600)
One port; TV-out i.MX37: SDTV i.MX51: also 720p60 or 1080i/p30
2
No
Up to 200 MHz
WUXGA
(1920x1200)
1080p
(1920x1080)
+ WVGA
(800x480)
4
Up to 266 MHz per IPU
Two ports
24 bits + 18 bits
Synchronous (for display refresh) and asynchronous (to memory)
Very flexible - glue-less connection to RAM-less displays, display controllers, and TV encoders.
No
No
Two channels; consumer version (multiple pairs); 2x 85 MHz or 170MHz
One port
No
2 planes
Also VGA
Rate increased to 1080p60
(up to 3 more planes for lower resolutions)
2 x 4XGA
(2048x1536)
2 x [1080p + WXGA
(1280x720)
]
Two ports
24 bits + 24 bits
One port, 2 lanes x 1 Gbps
(non-Automotive)
None (phased out)
Yes, for 2 displays
For 2 displays, 2 planes for each
(6 more planes for lower resolutions)
Up to 20 MP/sec Up to 200+ MP/sec Up to 500+ MP/sec
Color adjustment and smart gamut mapping; gamma correction and contrast enhancement
Supporting effective proprietary algorithms
Yes; Supporting efficient proprietary algorithms
IPU in i.MX6 D/Q (IPUv3H) – Maximal Resolution & Refresh Rate
• Capabilities
− Maximal display resolution: 4096x4096 pixels
− Maximal pixel rate: 264 MP/sec (200 MP/sec on i.MX53, 133 MP/sec on i.MX51)
• Display refresh rate
− The maximal refresh rate is: 264M / (W * H * B)
W*H is the display resolution
B is a factor >1 reflecting blanking overhead, e.g. as specified by VESA, CEA-861-D, etc.
− The table provides the maximal refresh rates for some typical resolutions
− Usually, the refresh rate is required to be at least 60 Hz, to prevent flicker.
− The blanking overhead factor assumed for the calculation is 1.3.
− The actual factor depends on the display and is often closer to 1, allowing higher resolutions @ 60 Hz (e.g. HD1440).
− For example, for HD1080, the standard specifies B~1.2
− This is the capability of each of the two IPUs, so the total capability of the processor is doubled.
Note: these rates refer only to screen refresh, gated by the capabilities of the display port. A full use case typically includes additional activities; to confirm that it is supported at a given refresh rate, additional aspects – video processing capabilities, capacity of the memory system, etc. – should also be analyzed carefully.
Name | Resolution (Width x Height) | Total [MP] | Maximal Refresh Rate [Hz]
VGA | 640 x 480 | 0.31 | 666
NTSC | 720 x 480 | 0.35 | 592
WVGA | 800 x 480 | 0.38 | 533
PAL | 720 x 576 | 0.41 | 493
SVGA | 800 x 600 | 0.48 | 426
WSVGA | 1024 x 600 | 0.61 | 333
XGA | 1024 x 768 | 0.79 | 260
HD720 | 1280 x 720 | 0.92 | 222
WXGA | 1366 x 768 | 1.05 | 195
WXGA+ | 1440 x 900 | 1.30 | 158
SXGA | 1280 x 1024 | 1.31 | 156
SXGA+ | 1400 x 1050 | 1.47 | 139
WSXGA+ | 1680 x 1050 | 1.76 | 116
UXGA | 1600 x 1200 | 1.92 | 107
HD1080 | 1920 x 1080 | 2.07 | 99
WUXGA | 1920 x 1200 | 2.30 | 89
9VGA | 1920 x 1440 | 2.76 | 74
4XGA | 2048 x 1536 | 3.15 | 65
HD1440 | 2560 x 1440 | 3.69 | 56
4WXGA | 2560 x 1600 | 4.10 | 50
4K x 2K | 4096 x 2048 | 8.39 | 25
800 x 480 (WVGA) ~331 Hz
1280 x 720 (HD720) ~138 Hz
1024 x 768 (XGA) ~161 Hz
1920 x 1080 (HD1080) ~61 Hz
2048 x 1536 (4XGA) ~40 Hz
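As a rough illustration of the refresh-rate formula on this slide, the sketch below evaluates pixel_rate / (W * H * B). The function name and defaults are assumptions made for the example. The tabulated values appear to correspond to a pixel rate of about 266 MP/sec with B = 1.3 rather than the 264 MP/sec quoted in the text, so small differences against the table are expected.

```python
def max_refresh_hz(width, height, blanking=1.3, pixel_rate=264e6):
    """Maximal refresh rate in Hz: pixel_rate / (W * H * B).

    pixel_rate is the maximal pixel rate of one IPU in pixels/sec,
    blanking is the overhead factor B (> 1).
    """
    return pixel_rate / (width * height * blanking)


if __name__ == "__main__":
    for name, w, h in [("HD720", 1280, 720), ("HD1080", 1920, 1080), ("4XGA", 2048, 1536)]:
        print(name, round(max_refresh_hz(w, h), 1))
```

A resolution whose result falls below 60 Hz with B = 1.3 may still be usable if the actual display needs less blanking, which is the point made about HD1440 above.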
IPU in i.MX6 D/Q (IPUv3H)
– Dual-Display Capabilities
First Display (columns):
WXGA (1366x768 ~ 1.0 MP; 72-85 MHz)
SXGA (1280x1024 ~ 1.3 MP; 91-109 MHz)
SXGA+ (1400x1050 ~ 1.5 MP; 101-122 MHz)
WSXGA+ (1680x1050 ~ 1.8 MP; 119-146 MHz)
UXGA (1600x1200 ~ 1.9 MP; 130-161 MHz)
WUXGA (1920x1200 ~ 2.3 MP; 154-193 MHz)
9VGA (1920x1440 ~ 2.8 MP; 185-234 MHz)
4XGA (2048x1536 ~ 3.2 MP; 209-267 MHz)
Second Display (rows) | WXGA | SXGA | SXGA+ | WSXGA+ | UXGA | WUXGA | 9VGA | 4XGA
SDTV 480i30/576i25 (27 MHz) | Full | Full | Full | Full | Full | Full | Partial | Partial
WSVGA (1024x600) (44-49 MHz) | Full | Full | Full | Full | Full | Partial | Partial | -
HDTV 720p60/1080i30 (74.25 MHz) | Full | Full | Full | Full | Full | Partial | - | -
WXGA (1366x768) (72-85 MHz) | Full | Full | Full | Full | Partial | - | - | -
WSXGA+ (1680x1050) (119-146 MHz) | Full | Partial | Partial | Partial | - | - | - | -
HDTV 1080p60 (148.5 MHz) | Full | Partial | - | - | - | - | - | -
• Notes
− This is the capability of each of the two IPUs, so the total capability of the processor is doubled.
− The maximal pixel clock rate supported by the display ports
Each display: 220 MHz
Total: 240 MHz
− For a TV, the clock rate is fixed by the corresponding standards
− For other displays
• The assumed screen refresh rate is 60 Hz
• The blanking overhead – impacting the pixel clock rate – may vary between displays. The table refers – for concreteness – to the VESA CVT (Coordinated Video Timing) specification
“Full support”: allowing full blanking (which is typically required for CRTs)
“Partial support”: allowing only reduced blanking (which is still typically sufficient for digital displays, e.g. LCDs)
− The above table describes only the capabilities of the display ports to perform screen refresh. A full use case typically includes additional activities; to confirm that it is supported with a given display configuration, additional aspects – video processing capabilities, capacity of the memory system, etc. – should also be analyzed carefully.
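The Full/Partial entries of the dual-display table seem to follow from the two clock limits in the notes (220 MHz per display, 240 MHz in total). The sketch below is an interpretation, not something the slides state: it assumes the lower end of each quoted clock range corresponds to reduced blanking and the upper end to full blanking. The function and constant names are made up for the example.

```python
PORT_LIMIT_MHZ = 220.0   # maximal pixel clock of each display port
TOTAL_LIMIT_MHZ = 240.0  # maximal combined pixel clock of both ports


def _fits(clock_a, clock_b):
    """True if both clocks respect the per-port and total limits."""
    return (clock_a <= PORT_LIMIT_MHZ and clock_b <= PORT_LIMIT_MHZ
            and clock_a + clock_b <= TOTAL_LIMIT_MHZ)


def dual_display_support(first, second):
    """Classify a display pair.

    Each argument is (min_clock_mhz, max_clock_mhz): reduced-blanking and
    full-blanking pixel clocks (assumed meaning of the quoted ranges).
    """
    if _fits(first[1], second[1]):
        return "Full"
    if _fits(first[0], second[0]):
        return "Partial"
    return "None"


if __name__ == "__main__":
    wuxga, sdtv = (154, 193), (27, 27)
    print(dual_display_support(wuxga, sdtv))
```

Under these assumptions, the spot checks against the table (for example WUXGA with SDTV, 4XGA with SDTV, SXGA with 1080p60) agree with the slide.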
i.MX6 D/Q Display Ports Muxing
IPU #0
DI0
DI1 DI0
IPU #1
DI1
Parallel
#0
MIPI
DSI
Parallel
#1
DCIC #0
Lockable control
DCIC #1
Lockable control
LVDS #0 LVDS #1
HDMI
MIPI
DSI
• Six ports
− Two parallel - driven directly by the IPU
− Two LVDS channels - driven by the LVDS bridge
− One HDMI – driven by the HDMI transmitter
− One MIPI-DSI – driven by the MIPI-DSI transmitter
• Four simultaneous outputs
− Each IPU has two display ports (DI0 and DI1)
− Therefore, up to four external ports can be active at any given time.
− Additional asynchronous data flows can be sent through the parallel ports and the MIPI-DSI port
• Display Content Integrity Check (DCIC)
− For parallel interfaces: probes the I/O loopback (essentially equivalent to probing the external wires)
− For other integrated interfaces (e.g. LVDS): probes the IPU output (essentially equivalent to the inputs to the serializers)
Max Display Port Resolutions on i.MX6Q/D
• MIPI DSI, 2 lanes
− WXGA (1366 x 768) or 720p (1280 x 720)
• RGB
− Port 1 – 4XGA (2048 x 1536)
− Port 2 – 4XGA (2048 x 1536)
• LVDS
− Single channel – WXGA (1366 x 768) or 720p (1280 x 720)
− Dual channel – UXGA (1600 x 1200) or 1080p (1920 x 1080)
• HDMI
− 1080p (1920 x 1080) or 4XGA (2048 x 1536)
Note: Assuming 30% blanking interval overhead, 24 bpp, 60 fps
i.MX51
Port Name (x=1,2)
DISPx_DAT0
DISPx_DAT1
DISPx_DAT2
DISPx_DAT3
DISPx_DAT4
DISPx_DAT5
DISPx_DAT6
DISPx_DAT7
DISPx_DAT8
DISPx_DAT9
DISPx_DAT10
DISPx_DAT11
DISPx_DAT12
DISPx_DAT13
DISPx_DAT14
DISPx_DAT15
DISPx_DAT16
DISPx_DAT17
DISPx_DAT18
DISPx_DAT19
DISPx_DAT20
DISPx_DAT21
DISPx_DAT22
DISPx_DAT23
DIx_DISP_CLK
DIx_PIN1
DIx_PIN2
DIx_PIN3
DIx_PIN4
DIx_PIN5
DIx_PIN6
DIx_PIN7
DIx_PIN8
DIx_D0_CS
DIx_D1_CS
DIx_PIN11
DIx_PIN12
DIx_PIN13
DIx_PIN14
DIx_PIN15
DIx_PIN16
DIx_PIN17
Connecting a display on the parallel interface
RGB, Signal name
(General)
DAT[0]
DAT[1]
16 bit RGB
B[0]
B[1]
DAT[2]
DAT[3]
DAT[4]
DAT[5]
DAT[6]
DAT[7]
DAT[8]
DAT[9]
DAT[10]
DAT[11]
DAT[12]
DAT[13]
DAT[14]
DAT[15]
DAT[16]
G[5]
R[0]
R[1]
R[2]
R[3]
R[4]
B[2]
B[3]
B[4]
G[0]
G[1]
G[2]
G[3]
G[4]
DAT[17]
DAT[18]
DAT[19]
DAT[20]
DAT[21]
DAT[22]
DAT[23]
18 bit RGB
B[0]
B[1]
G[4]
G[5]
R[0]
R[1]
R[2]
R[3]
R[4]
B[2]
B[3]
B[4]
B[5]
G[0]
G[1]
G[2]
G[3]
R[5]
LCD
24 bit RGB
B[0]
B[1]
G[2]
G[3]
G[4]
G[5]
G[6]
G[7]
R[0]
B[2]
B[3]
B[4]
B[5]
B[6]
B[7]
G[0]
G[1]
R[1]
R[2]
R[3]
R[4]
R[5]
R[6]
R[7]
PixCLK
HSYNC
VSYNC
DRDY/DV
RGB/TV Signal Allocation (Example)
8 bit YCrCb
Y/C[0]
Y/C[1]
Y/C[2]
Y/C[3]
Y/C[4]
Y/C[5]
Y/C[6]
Y/C[7]
16 bit YCrCb
C[0]
C[1]
C[2]
C[3]
C[4]
C[5]
C[6]
C[7]
Y[0]
Y[1]
Y[2]
Y[3]
Y[4]
Y[5]
Y[6]
Y[7]
20 bit YCrCb
C[0]
C[1]
Y[0]
Y[1]
Y[2]
Y[3]
Y[4]
Y[5]
Y[6]
C[2]
C[3]
C[4]
C[5]
C[6]
C[7]
C[8]
C[9]
Y[7]
Y[8]
Y[9]
Smart
CS0
CS1
WR
RD
RS1
RS2
DRDY
Signal name
DAT[0]
DAT[1]
DAT[2]
DAT[3]
DAT[4]
DAT[5]
DAT[6]
DAT[7]
DAT[8]
DAT[9]
DAT[10]
DAT[11]
DAT[12]
DAT[13]
DAT[14]
DAT[15]
VSYNC_IN
IPUv3H
– Video In Support
i.MX51, i.MX53, i.MX6 D/Q
– Video In Support
Feature
Video Input
Interfaces
Parallel
Video Rate
MIPI/CSI-2
Playback
Record
2-way
Video Processing De-interlacing
Resizing
Rotation/inversion
Color conversion
Memory Interface Protocol
Throughput
Efficient memory bus utilization
Control capabilities
Synchronization
(to prevent tearing) i.MX51 (IPUv3EX) i.MX53 (IPUv3M) i.MX6 D/Q
(2 x IPUv3H)
120 MHz; e.g. 6 MP @ 15 fps
720p30
(1280x720)
D1
@ 30 fps
(720x480@30 fps or 720x576@25 fps
)
No
Two ports, 20 bits + 8 bits
180 MHz; e.g. 9 MP @ 15 fps
1080i/p
(1920x1080)
@ 30 fps
720p30
(1280x720)
@ 30 fps
720p @ 20 fps
200 MHz; e.g. 10 MP @ 15 fps
One port, 4 lanes x 1 Gbps
1080i/p + D1 @ 30 fps
1080p @ 30 fps
720p @ 30 fps
64-bit, 133 MHz
High-quality motion adaptive algorithm
Yes
– fully flexible
Yes
Yes
– fully flexible
AXI
– Including split transaction
64-bit, 200 MHz
Selective read for combining
64-bit, 266 MHz
Display controller, DMA controller, Internal synchronization
Autonomous operations: display refresh/update, view-finder
Double/triple buffering
Frame-by-frame or tight
– sub-frame (utilizing internal memory)
Video Input Ports In i.MX6 D/Q
• Three ports; up to six input channels
− Two parallel – connected directly to the IPUs; independent clock and format setting
− One MIPI/CSI-2 – can transfer up to four concurrent channels
− Each port: up to 150Mpxl/s @200MHz, e.g. 10Mpxl @ 15fps
• Four concurrent channels
− Each IPU has two input ports (CSI0 and CSI1); each can process an input channel from one of the external ports.
− The MIPI/CSI-2 bridge sends all its channels to all the IPU input ports, and each port can select a different channel for processing, identified by its DI (Data Identifier).
− Additional channels can be transferred through a CSI transparently – as generic data – directly to the system memory.
• Formats supported:
− BT.656
− BT.1120
− YUV422, RGB888, YUV444 over an 8-bit bus
− RAW format up to 16 bpp, which is translated to 8 bits using companding
− Generic data up to 20 bits
Video Sources i.MX6 D/Q
Parallel 0
MIPI/CSI-2
MIPI/CSI-2
Receiver
Parallel 1
CSI0
CSI1
CSI1
CSI0
IPU #0
IPU #1
IPUv3H
– The camera port
CSI
(Camera
Sensor I/F)
CSI
(Camera
Sensor I/F)
Cameras
SMFC
(Sensor
Multi FIFO Ctrl.)
VDI (Video
De-Interlacer)
IDMAC
(Image
DMA
Controller)
64-bit
AXI
Memory
IC
(Image
Converter) display
The Camera Sensor Interface - CSI
• Role: controls the camera port
− Provides direct connectivity to relevant image sensors and connectivity bridges: CSI-2, HDMI receiver, TV decoder…
• Data bus – up to 20 bits
− Single value – up to 16 bits
− Two values – up to 10 bits each; e.g. HDTV YUV 4:2:2 input
• Variety of data formats
− Main (with on-the-fly processing): YUV 4:2:2/4:4:4, RGB 16/24 bpp
− Other: as generic data, including compressed streams
− All primary CSI-2 formats
• Frame resolution
− Up to 8192 x 4096 pixels
• Input rate
− 240M values/sec peak (@ 264 MHz internal clock)
• Additional features
− Frame rate reduction – by skipping (reduction ratio: m:n, m<=n<=12)
− Window-of-interest selection – by cropping
i.MX37, i.MX51, i.MX53, i.MX6 D/Q
– Video in Support
• How to calculate the video in pixel clock?
− FW = Frame Width
− FH = Frame Height
− FPS = Frame rate (fps)
− BI = Blanking interval: provided in the device’s datasheet, up to 35% (1.35).
− D = Data format: how the data is arranged on the bus (cycles/pixel)
• The pixel clock [MHz] is calculated according to:
F = FW x FH x FPS x BI x D
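A minimal sketch of the video-in formula, extending the display calculation with the D (cycles/pixel) factor. The function name is an assumption, and the result is divided by 10^6 to give MHz.

```python
def csi_pixel_clock_mhz(frame_width, frame_height, fps, blanking, cycles_per_pixel):
    """Input pixel clock in MHz from F = FW x FH x FPS x BI x D.

    cycles_per_pixel is D: how many bus cycles carry one pixel
    (e.g. 2 for a 16 bpp format sent over an 8-bit bus).
    """
    return frame_width * frame_height * fps * blanking * cycles_per_pixel / 1e6


if __name__ == "__main__":
    # 720p at 30 fps, 35% blanking, one and two cycles per pixel
    print(round(csi_pixel_clock_mhz(1280, 720, 30, 1.35, 1), 2))
    print(round(csi_pixel_clock_mhz(1280, 720, 30, 1.35, 2), 2))
```

Doubling D doubles the required clock, which is why narrow-bus formats reach the port's clock limit at lower resolutions.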
16-bit camera support
• 16-bit YUV422
− CSI receives 2 components per cycle.
− CSI [16 bit generic data] => SMFC => MEM [16 bit generic data] => IPU [YUV422]
• 16-bit RGB as generic data
− CSI receives 3 components per cycle.
− Use a 16 bit sample of it (such as RGB565)
− CSI [16 bit generic data] => SMFC => MEM [16 bit generic data] => IPU [16 bit RGB] => IPU [map to 24bpp RGB]
• 16-bit RGB565
− On the fly processing of 16 bit data.
− CSI is programmed to receive 16 bit generic data.
− The interface is restricted to "non-gated mode" and the CSI#_DATA_SOURCE bit has to be set
− If the external device is 24 bit, the user can connect a 16 bit sample of it (RGB565 format).
− The IPU has to be configured in the same way as in the case of CSI#_SENS_DATA_FORMAT=RGB565
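To make the "map to 24bpp RGB" step concrete, the sketch below expands one RGB565 value to 8 bits per component using bit replication. This is a common software convention chosen for illustration; how the IPU itself performs the mapping is configurable and is not described on the slide. The bit layout (R in bits 15:11, G in 10:5, B in 4:0) is the usual RGB565 layout and is assumed here.

```python
def rgb565_to_rgb888(pixel):
    """Expand a 16-bit RGB565 value to an (r, g, b) tuple of 8-bit components."""
    r5 = (pixel >> 11) & 0x1F
    g6 = (pixel >> 5) & 0x3F
    b5 = pixel & 0x1F
    # Replicate the top bits into the low bits so that full scale maps to 255.
    r8 = (r5 << 3) | (r5 >> 2)
    g8 = (g6 << 2) | (g6 >> 4)
    b8 = (b5 << 3) | (b5 >> 2)
    return r8, g8, b8


if __name__ == "__main__":
    print(rgb565_to_rgb888(0xF800))  # pure red in RGB565
```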
BT.1120/BT.656 support
• BT.656
− CCIR progressive/interlaced modes
• BT.1120
− CCIR progressive/interlaced modes
− SDR/DDR mode
• The timing reference signals (sync events) are embedded in the data.
• The CCIR codes are defined in the standard. The IPU can support non-standard codes using the CCIR registers.
• On-the-fly processing is supported in both modes
BT.1120/BT.656 support
• IPUv3 is an 8-bit-per-component system.
• For 10-bit data inputs there are a few ways to handle the data
− Companding - programmable piecewise-linear map
− Regular and tight packing to memory
BT.1120 mode | connectivity | companded | regular packing | tight packing
20bit, YUV422-10 | D[19:0] | {8DC,8DC,8DC,8R} | {16DE,16DE,16DE,16R} | {10DE,10DE,10DE,2R}
20bit, YUV422-8 | D[19:12], D[9:2] | {8DC,8DC,8DC,8R} | {8D,8D,8D,8R} | NA
DC - data after being companded
DE - data after being extended
R - reserved bits
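The difference between regular and tight packing can be illustrated with a small sketch. Tight packing stores three 10-bit components plus 2 reserved bits in a 32-bit word, whereas regular packing spends 16 bits per component (64 bits for the same three components plus a reserved field). The bit positions used below (first component in the low bits, reserved bits at the top) are an assumption for illustration only; the actual ordering is defined by the IPU specification.

```python
def pack_tight(c0, c1, c2):
    """Pack three 10-bit components into one 32-bit word ({10,10,10,2R}).

    Assumed layout: c0 in bits 9:0, c1 in 19:10, c2 in 29:20, bits 31:30 reserved.
    """
    for c in (c0, c1, c2):
        if not 0 <= c < 1024:
            raise ValueError("component does not fit in 10 bits")
    return c0 | (c1 << 10) | (c2 << 20)


def unpack_tight(word):
    """Inverse of pack_tight: return the three 10-bit components."""
    return word & 0x3FF, (word >> 10) & 0x3FF, (word >> 20) & 0x3FF


if __name__ == "__main__":
    word = pack_tight(1, 2, 3)
    print(hex(word), unpack_tight(word))
```

With this layout, tight packing halves the memory traffic relative to regular packing for 10-bit data, at the cost of components that are not byte-aligned.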
i.MX37, i.MX51, i.MX53, i.MX6 D/Q
– Video in Support
Data Format | CSI Bus width | D (cycles/pixel) | comment
RGB888/YUV444 | 8 | 3 | D[19:12]
YUV422 | 8 | 2 | D[19:12]
YUV422 | 16 | 1 | D[19:4]
Generic data | 16 | 1 | D[19:4]
RGB565 | 8 | 2 | D[19:12]
RGB565 | 16 | 1 | D[19:4]
BT.1120 | 20 | 1 | D[19:0]
BT.1120 | 16 | 1 | D[19:12], D[9:2]
BT.656 | 8 | 2 | D[19:12]
Bayer | 16 | 1 | D[19:4]
F = FW x FH x FPS x BI x D
CSI data mapping
CSI1_D19
CSI1_D18
CSI1_D17
CSI1_D16
CSI1_D15
CSI1_D14
CSI1_D13
CSI1_D12
CSI1_D11
CSI1_D10
CSI1_D9
CSI1_D8
CSI1_D7
CSI1_D6
CSI1_D5
CSI1_D4
CSI1_D3
CSI1_D2
CSI1_D1
CSI1_D0
RGB888/
YUV444
D7
D6
D5
RGB565
8bit
D7
D6
D5
RGB565
16bit YUV4:2:2
Generic data BT.656
YUV422
16 bit
BT.1120
(YUV422-10)
BT.1120
(YUV422-8)
R4
R3
R2
D7 MSB
D6 MSB-1
D5 MSB-2
D7
D6
D5
Y7
Y6
Y5
Y9
Y8
Y7
Y7
Y6
Y5
D4
D3
D2
D4
D3
D2
R1
R0
G5
D4 MSB-3
D3 MSB-4
D2 MSB-5
D4
D3
D2
Y4
Y3
Y2
Y6
Y5
Y4
Y4
Y3
Y2
D1
D0
D1
D0
G4
G3
G2
G1
G0
B4
B3
B2
D1 MSB-6 D1
D0 MSB-7 D0
MSB-8
MSB-9
MSB-10
MSB-11
MSB-12
MSB-13
Y1
Y0
CrCb7
CrCb6
CrCb5
CrCb4
CrCb3
CrCb2
Y3
Y2
Y1
Y0
CrCb9
CrCb8
CrCb7
CrCb6
Y1
Y0
0
0
CrCb7
CrCb6
CrCb5
CrCb4
B1
B0
MSB-14
MSB-15
CrCb1
CrCb0
CrCb5
CrCb4
CrCb3
CrCb2
CrCb1
CrCb0
CrCb3
CrCb2
CrCb1
CrCb0
0
0
IPUv3H
– Fundamentals
The Image Processing Unit
Video
Sources
Displays
Camera
Interface
IPU
Sync &
Control
MCU
Processing
Including
Image
Enhancements
And
Conversions
Memory
Interface
Memory
Display
Interface
• Functions: comprehensive support for the flow of data from an image sensor and/or to a display device.
− Connectivity to relevant devices
− Related image processing and manipulation
− Synchronization and control capabilities
IPUv3H
– Internal Structure
Cameras
CSI
(Camera
Sensor I/F)
Displays
SMFC
(Sensor
Multi FIFO Ctrl.)
IPUv3H
VDI
(Video
De-Interlacer)
DI
(Display I/F)
IC
(Image
Converter)
DP
(Display
Processor)
DC
(Display Contr.)
CM
(Control
Module)
DMFC
(Display
Multi FIFO Ctrl.)
IRT
(Image
Rotator)
32-bit AHB
MCU
IDMAC
(Image
DMA
Controller)
64-bit
AXI
Memory
IPUv3 Fundamentals - The display port
• The display port handles all the IPUv3 features targeted at controlling and sending data to the display.
• The display port consists of 4 modules:
− DC - a display controller
− DP - a display processor
− DMFC - a display multi-FIFO controller
− DI - a display interface. The DI is instantiated twice to provide two symmetrical display interfaces.
IPUv3 Fundamentals - Supported display interfaces
• The total number of displays supported by IPUv3 is 4.
• The display port has 2 DI interfaces.
• Each interface can handle up to 3 displays.
• Each DI can handle up to 2 asynchronous interfaces (e.g. smart LCD, graphic accelerator) - only one of them can be a serial interface.
• Each DI can handle one synchronous interface (e.g. TV, dumb LCD).
IPUv3 Fundamentals
– display channels’ mapping
• The display port supports multiple flows that may have different characteristics
• In order to configure the IPU we need to identify some of the flow’s characteristics and allocate the IPUv3 resources that will participate in that flow.
• First we need to understand how the channels are distributed.
IPUv3 Fundamentals
– display channels’ mapping
27
28
29
21
Ch #
23
24
Destination
DC
DP - primary
DP - Primary
DP -Secondary
DC
DP -Secondary
DMFC/DC numbering
5B/5
6B/6
5F/5
1/1
6F/6
Flow’s nature
Alpha channel
SYNC or
ASYNC
NA
SYNC
ASYNC
51
52 comment
Direct flow via IC. If this flow is used it replaces one of the DMFC channels.
Ch 23 is associated with ch 27. When there’s only one plane in the flow – this channel should be used.
Ch 24 is associated with ch 29.
2 ASYNC flows can use this channel via alternate flow. When there’s only one plane in the flow – this channel should be used.
SYNC
SYNC or
ASYNC
ASYNC
31
NA
33 if ch28 is connected to DI0 then ch23 must be connected to DI1
IPUv3 Fundamentals
– display channels’ mapping
40
41
42
43
44
Ch # Destination DMFC/DC numbering
DC 0/0
Flow’s nature
Read
DC
DC
2/2
1C
ASYNC
Alpha channel
NA
NA
Command NA
DC
DC
2C
3
Command
Mask
NA
NA comment
Refer to the spec for the command channel restrictions
Refer to the spec for the command channel restrictions
Mask channel can be associated with ch 23 or ch 28
IPUv3 Fundamentals - DI
• The DI is responsible for the timing waveforms of each signal in the display’s interface.
• The DI is composed of
− 8 sets of waveform generators controlling signals associated with the DI’s clock. These signals drive PIN1-PIN8. These pins can be used for signals like VSYNC, HSYNC
− 12 sets of waveform generators controlling signals associated with the data. These signals drive PIN11-PIN17 + 2 CS signals. These pins can be used for signals like DRDY, CS, RS
− The DI generates the clock to the display. The DI clock can be derived from the IPUv3 hsp_clk or from a clock external to the IPU (PLL or pin)
IPUv3 Fundamentals - DI
• This waveform describes how the display clock’s parameters are set.
IPUv3 Fundamentals - DI
• This waveform describes how the DI’s PIN parameters are set.
IPUv3 Fundamentals - DI
• This waveform provides an example of waveform concatenation.
IPUv3 Fundamentals - DC
• The DC (Display Controller) is responsible for:
• Activation of a flow
− When there’s new content to be displayed (asynchronous flow)
− Upon an internal timer (synchronous flow)
• Linkage between the microcode and
− mapping units (within DC)
− timing units (within DI)
IPUv3 Fundamentals -
DC
uCODE
Routine
Events
(NL, NF, New Data…) uCODE
Routine uCODE
Routine
Mapping
Mapping unit
Unit (DC)
(DATA handling)
(timing characteristics)
IPUv3 Fundamentals - DMFC
• The DMFC is a multi-FIFO controller utilizing a single memory to serve the DC and DP channels.
• The FIFO is partitioned into 8 equal segments. The segments can be allocated asymmetrically to the channels.
• The memory allocation for a specific channel must not be greater than a certain number of rows. The exact number differs from one channel to another (see the spec).
• When the direct path from the IC is used, this channel replaces one of the existing DMFC channels.
The Image DMA Controller
– IDMAC
• Role: controls the memory ports; transfers data to/from system memory
• Memory ports - AXI
− IDMAC: 1 read, 1 write
• Throughput:
− External: 64 bits @ 264 MHz
− Internal: up to 2 pixels/cycle @ 264 MHz (through each port)
− Shared by all DMA channels: input from sensor, output to display and off-line processing -> efficient utilization of the bandwidth in different use cases
− Efficient pipelining: 4 AXI IDs; multiple outstanding transactions: read – 8; write – 6
• Data arrangement in memory
− Row-after-row, with flexible line-stride – as needed for a window in a video/graphics buffer
• Access order
− Block-by-block – for rotation
− Row-by-row – used for all other channels
Needed for output to display or input from sensor
Decreases the memory bus load by increasing its utilization efficiency
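As a back-of-the-envelope illustration of the throughput figures above, the sketch below computes the peak bandwidth of one 64-bit AXI port at 264 MHz and the share of it consumed by refreshing a single display. The function names are assumptions, and the estimate ignores blanking, bus protocol overhead and tightly packed formats.

```python
AXI_BUS_BITS = 64      # external AXI data width, from the slide
IPU_CLOCK_HZ = 264e6   # i.MX6 D/Q IPU clock, from the slide


def peak_port_bandwidth():
    """Peak bytes/sec through one AXI port (read or write)."""
    return AXI_BUS_BITS / 8 * IPU_CLOCK_HZ


def refresh_bandwidth(width, height, fps, bits_per_pixel):
    """Bytes/sec read from memory to refresh one plane of one display."""
    return width * height * fps * bits_per_pixel / 8


if __name__ == "__main__":
    need = refresh_bandwidth(1920, 1080, 60, 32)
    print(f"{need / 1e6:.1f} MB/s of {peak_port_bandwidth() / 1e6:.0f} MB/s peak "
          f"({100 * need / peak_port_bandwidth():.1f}%)")
```

Because the read port is shared by all DMA channels, each additional plane or off-line task adds to this figure, which is why the slides stress packed formats and conditional reads.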
The Image DMA Controller
– IDMAC (cont.)
• A variety of pixel formats
− YUV 4:2:0/4:2:2/4:4:4 – for video
− Conventionally packed RGB pixels – 8/16/32 bpp – for graphics
− Tightly packed RGB pixels – 12/18/24 bpp
For reduced load on the memory bus and for power-efficient screen refresh
− Optional independent alpha (translucency) input
For planes that do not have interleaved alpha
− Coded color (using a LUT; 4/8 bpp)
An additional option to reduce bus load and power
− Gray scale
− Generic data
• Additional features
− Conditional read (for combining) – transparent pixels are not read
Pixel transparency – identified from the independent alpha input
− Scrolling and panning
− Uniform programming model for all channels (stored in the CPMEM – channel parameter memory)
IPUv3 Fundamentals - IDMAC
• The IDMAC of IPUv3 is connected to the AXI bus.
− Full separation between read and write
− All the channels are symmetric
− 64-bit AXI bus, internal bus of 128-bit (4 pixels)
• Each channel uses 2X160 words of the CPMEM memory, holding the channels settings
• Ability to use alternate rows in the CPMEM for alternate flows
• Usage of alpha located in separate buffers (ATC)
• Dynamic and static arbitration between channels.
• Prioritizing screen refresh channels over other IPU channels and other DDR masters
DP (Display Processor)
To
DC
4x3x8 bits
Output
FIFO
Color
Conversion
& Correction
Gamma
Correction
& Contrast
Stretching
Cursor
Overlay
Alternative
Locations
DP
Combining
Color
Conversion
& Correction
Color
Conversion
& Correction
Input
FIFO
Input
FIFO
Primary
Secondary
Main Plane
(full screen)
From
DMFC
4x4x8 bits
Primary
Secondary
Additional
Plane
• The DP has the following features:
− Supports input formats YUVA/RGBA
− Combining 2 video/graphics planes
− Color conversion (YUV <-> RGB, YUV <-> YUV) & correction (gamut mapping)
− Gamma correction and contrast stretching
− Supports output formats YUV/RGB
− Dynamic task switching between async and sync flows
IPUv3 Fundamentals - DP
• The DP handles the content of the frame.
• It performs image processing on the way to the display (combining, CSC, gamma correction)
• The DP supports one synchronous flow and two asynchronous flows.
• All DP configuration is done only via the SRM. The control module handles the automatic switch between DP settings when the DP switches from one flow to another.
IC (Image Converter)
Horizontal
Bilinear
Resizing
Resizing
Row Buffers
Vertical
Bilinear
Resizing
Down-Sizing
Row Buffers
Color
Conversion
& Correction
Power-Of-Two
Down-Sizing
Combining
Color
Conversion
& Correction
IC
Video
Input FIFOs
Graphics
Input FIFOs
Output FIFOs
From IDMAC
From IDMAC
To IDMAC/DMFC
• Resizing
− Fully flexible resizing ratio
Maximal downsizing ratio: 8:1
Maximal upsizing ratio: 1:8192
− Independent horizontal and vertical resizing ratios
• Color conversion/correction
− YUV <-> RGB, YUV <-> YUV conversion
• Combining with a graphic plane
• Max output width 1024 pixels. Larger images are processed in stripes
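The two IC limits above (1024-pixel output width, 8:1 maximal downsizing) lend themselves to a quick feasibility check. The sketch below is an assumption-laden illustration: the helper names are invented, and a real striped resize also has to deal with the seams between stripes, which is not modeled here.

```python
import math

IC_MAX_OUTPUT_WIDTH = 1024  # pixels, from the slide
IC_MAX_DOWNSIZE = 8         # maximal downsizing ratio 8:1, from the slide


def stripes_needed(output_width):
    """Minimal number of vertical stripes so that each stripe is <= 1024 wide."""
    return math.ceil(output_width / IC_MAX_OUTPUT_WIDTH)


def downsizing_ok(input_size, output_size):
    """True if a single IC pass can shrink input_size to output_size."""
    return input_size <= IC_MAX_DOWNSIZE * output_size


if __name__ == "__main__":
    print(stripes_needed(1920))       # a 1080p-wide output
    print(downsizing_ok(1920, 240))   # exactly 8:1
    print(downsizing_ok(1920, 200))   # more than 8:1, needs a second pass
```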
The Image Rotator - IRT
• Role: performs rotation and inversion
− Rotation: 90, 180, 270 degrees
− Inversion: horizontal and vertical
• Rate: up to 100M pixels/sec
− (depends on use case)
• Additional features
− Acts on 8x8 blocks
− Multi-tasking: up to three tightly time-shared tasks – block-by-block
− Pixel format: 24-bit
The Video De-Interlacer or combiner - VDIC
• Role 1: performs de-interlacing – converting interlaced video to progressive
• Method: a high-quality motion adaptive filter
− For slow motion – retains the full resolution (of both top and bottom fields), by using temporal interpolation
− For fast motion – prevents motion artifacts, by using vertical interpolation
• Resolution: field size up to 968x1024 pixels on i.MX6 and 720x1024 pixels on i.MX5. Larger frames are processed in stripes (split mode).
• Output rate: up to 120M pixels/sec
• Additional features
− Uses three input fields for each output frame (the minimum needed for a reliable motion detection)
− Vertical interpolation – 4-tap filter; using an internal row buffer
− Single concurrent flow
− Input may come from a video decoder (VPU) or directly from the CSI
The Video De-Interlacer or combiner - VDIC
• Role 2: performs combining – overlaying of 2 frames in the same color space
• As an alternative to the de-interlacing function, the VDIC HW can perform combining
− Combining of 2 planes
− The planes don’t have to be of the same size
− The planes must be of the same color space (no CSC)
− Performs 1 pixel per cycle
− Color keying, alpha blending
IPUv3H
– Combining
IPUv3
– Basic Combining Capabilities
Combining in the Display Processor (DP)
• Two planes
− One plane may have any size and location
− The other one must be “full-screen” (cover the full output area)
• Maximal rate: i.MX37/51 – 133 MP/sec, i.MX53 – 200 MP/sec, i.MX6 Dual/Quad – 264 MP/sec
Combining in the Image Converter (IC)
• Two planes; both “full-screen” (cover the full output area)
• Maximal rate: i.MX37/51 – 20 MP/sec, i.MX53 – 30 MP/sec, i.MX6 Dual/Quad – 40 MP/sec
• Combining methods (in both cases)
− Color keying and/or alpha blending
− Alpha: global or per-pixel; interleaved with the pixels (upper plane) or as a separate input
Note: This is the capability per IPU, so the total capability of the processor is doubled in i.MX6DQ.
Diagram labels (i.MX37/51/53/6 Dual/Quad): DI, DI, DC, DP, IC (IPUv3); External Memory: Plane 1, Plane 2, Plane 3, Plane 4
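The two combining methods named above, color keying and alpha blending, can be illustrated per pixel as follows. This is a generic software formulation written for this example; the IPU's exact arithmetic, rounding, and the choice of which plane carries the key color are not specified on the slide and are assumed here.

```python
def blend_pixel(upper, lower, alpha, color_key=None):
    """Combine one pixel of an upper plane over a lower plane.

    upper, lower: (r, g, b) tuples with 8-bit components.
    alpha: opacity of the upper plane, 0 (transparent) to 255 (opaque);
           may be a global value or taken per pixel.
    color_key: if the upper pixel equals this color, the lower pixel shows through.
    """
    if color_key is not None and upper == color_key:
        return lower
    # Weighted average with rounding to the nearest integer.
    return tuple((u * alpha + l * (255 - alpha) + 127) // 255
                 for u, l in zip(upper, lower))


if __name__ == "__main__":
    print(blend_pixel((200, 100, 0), (0, 100, 200), 128))
    print(blend_pixel((255, 0, 255), (10, 20, 30), 255, color_key=(255, 0, 255)))
```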
IPUv3
– Off-Line Combining
i.MX37/51/53/6 Dual/Quad
• Unlimited number of planes combined sequentially
Combining in the VDIC (Video De-Interlacer & Combiner) – i.MX53/6 Dual/Quad only
• Available when de-interlacing is not needed
• Two planes; each may have any size and location (supplemented by a “background color”)
• Maximal rate: i.MX53 – 180 MP/sec, i.MX6 Dual/Quad – 240 MP/sec
• Combining method – as in the DP and IC
Note: This is the capability per IPU, so the total capability of the processor is doubled in i.MX6DQ.
Diagram labels: External Memory (Plane 1 (bottom), Plane 2, Plane 3, Top plane, Temporary, Temporary, Result); IPUv3 (VDIC or IC, used three times; same HW)
IPUv3
– Maximal On-The-Fly Combining To A Single Display
3 planes: i.MX37/51 – up to 20 MP/sec
4 planes: i.MX53 – up to 30 MP/sec, i.MX6 Dual/Quad – up to 40 MP/sec
i.MX37/51
DI
DC
DP
IPUv3
IC
External
Memory
Plane 3 (top)
Plane 2
Plane 1 (bottom)
Note: the bottom plane may be a result of additional offline combining of several planes
Note: This is the capability per IPUs, so the total capability of the processor is doubled in i.MX6DQ.
i.MX53/6 Dual/Quad
External
Memory
Plane 4 (top)
DI DP
DC
Plane 3
IC
IPUv3
VDIC
Plane 2
Plane 1 (bottom)
TM
External Use
58
i.MX6 Dual/Quad: On-The-Fly Combining Using 2x IPUv3
• Example 1: 2x 4 planes
• Example 2: 1x 7 planes
• Example 3: 4x 2 planes
Note: Some planes may be a result of additional off-line combining of several planes. Such combining may be performed either with IPUs or GPUs.
[Block diagrams, one per example: planes in external memory feeding IPUv3-0 and IPUv3-1, each shown with VDIC, IC, DP, DC and DI blocks; Example 2 also shows the CSI of IPUv3-1.]
IPUv3 combining capabilities – summary
Feature | DP | IC | VDIC
Output | Display | Memory* | Memory
Relations between planes | PlaneA <= PlaneB | PlaneA = PlaneB | PlaneA <= PlaneB
Color space conversion | Yes | Yes | No
Performance | 1 cycle/pixel | 4 cycles/pixel | 1 cycle/pixel
HW cursor | 32x32 unified color | No | No
Output image size | FW up to 2048 | FW up to 1024 | FW up to 1920
Color keying | Yes | Yes | Yes
Alpha blending | Yes | Yes | Yes
* The output of the IC can be sent directly to a smart display
IPUv3 Fundamentals – programming steps
This is a high-level example of the IPUv3 programming flow.
1. What are the displays connected in the use case?
   a. Allocate the displays to each DI
   b. Define the timing characteristics of each signal for each display.
2. Define each flow in the DC (sync/async)
   a. Define the events that trigger the flow – and what to do upon their arrival
   b. Allocate mapping unit, and mapping scheme
   c. Allocate waveform generator in the DI
3. Configure the DP for each flow
4. Configure the IDMAC
   a. How the data is arranged in the memory (interleaved/not interleaved)
   b. What’s the data’s format (PFS, BPP), mapping
5. Processing: VDIC, IC (resizing, CSC) and rotation settings
6. Control module configuration for activation of a flow
   a. Define the trigger to start a flow
   b. Define the processing chain
IPUv3H – Control
The Control Module - CM
• The control module is responsible for the flow management within the IPU.
• The module is composed of
− General control Registers (GCR)
− Frame synchronization unit (FSU)
− Shadow registers module (SRM)
− Interrupt controller
− Low Power modes controller
− Debug unit
FSU – double buffering
• Similar to IPUv1, the data is tightly pipelined using double buffering
FSU – task chaining
Peripherals
MIPI DSI
The MIPI DSI interface is a digital core accompanied by a multi-lane D-PHY that implements all protocol functions defined in the MIPI DSI Specification, providing an interface between the system and a MIPI DSI compliant display.
Features of the MIPI DSI complex:
• Supported standard versions:
− MIPI DSI Compliant
− DSI Version 1.01
− DPI Version 2.0
− DBI Version 2.0
− DSC Version 1.02
− PPI for D-PHY
− MIPI D-PHY Version 1.0
• Configuration: one clock lane, two data lanes
• Speed: up to 1 Gb/s per lane (fast speed). Low speed/low power signaling supported
• DSI can support both command and video modes and up to four virtual channels to accommodate multiple displays.
• Command and video mode support (type 1, 2, 3, and 4 display architecture)
• Mode switching: low power and ultra low power
• Burst mode/Non-burst mode
• Bus turnaround
• Fault error recovery scheme
• Both DPI and DBI coexist in the system, but only one of them can be active at a given time
MIPI CSI-2
The MIPI CSI-2 interface is a digital core accompanied by a multi-lane D-PHY that implements all protocol functions defined in the MIPI CSI-2 Specification, providing an interface between the system and a MIPI CSI-2 compliant camera sensor.
Features of the MIPI CSI-2 complex:
• Supported standard version: MIPI CSI-2 Version 1.0
• Configuration: one clock lane, four data lanes
• Speed: up to 1 Gb/s per lane
• Throughput: 250 MB/sec
• Timing-accurate signaling of frame and line synchronization packets
• Support for several frame formats such as:
− General frame or digital interlaced video, with or without accurate sync timing
− Data type (packet or frame level) and virtual channel interleaving
• 32-bit image data interface delivering data formatted as recommended in the CSI-2 Specification
• Directly supports conversion of all primary data formats to IPU input. Some secondary formats are treated as “generic” data
− RGB, YUV and RAW color space definitions
− From 24-bit down to 6-bit per pixel
− Generic or user-defined byte-based data types
LVDS Interface in i.MX53 & i.MX6 D/Q – Key Features
• Structure
− Two channels
− Each channel contains 4 data pairs + 1 clock pair
− Data
  18 bpp pixels – using 3 LVDS data pairs
  24 bpp pixels – using 4 LVDS data pairs
− Control signals: HSYNC, VSYNC, DE
• Pixel clock rate
− Single Channel: up to 85 MHz; e.g. WXGA @ 60 fps or 720p60
− Dual Channel: up to 170 MHz; e.g. UXGA @ 60 Hz or 1080p60
• Relevant Standards
− PHY Standard: ANSI EIA-644A
− Display Protocol Standards:
  SPWG Standard Panel Working Group Specification 3.8 (May 2007)
  VESA PSWG – Panel Standardization Working Group – set of standards for panels using LVDS.
  JEIDA/JEITA DISM Standard JEIDA-59-1999
  OpenLDI (National) – Revision 0.95 13/May/1999. *Only* Unbalanced operating mode supported (aligned with vast majority of LCD vendors).
LVDS – what is supported?
• Single Channel configuration
− Pixel clock: up to 85 MHz; e.g. WXGA @ 60 fps or 720p60
− LVDS clock frequency = pixel clock x 7 = 85 x 7 = 595 MHz
− Data:
  18 bpp pixels – using 3 LVDS data pairs
  24 bpp pixels – using 4 LVDS data pairs
• Dual Channel configuration
− Pixel clock: up to 170 MHz; e.g. UXGA @ 60 Hz or 1080p60
− LVDS clock frequency = pixel clock x 7/2 = 170 x 7/2 = 595 MHz
− Data:
  18 bpp pixels – using 3 LVDS data pairs per channel
  24 bpp pixels – using 4 LVDS data pairs per channel
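The clock arithmetic on this slide can be captured in a few lines. This is a sketch with hypothetical helper names; it uses only the x7 and x7/2 factors and the data-pair counts quoted on the slide.

```c
/* LVDS clock frequency in MHz as computed on the slide: pixel clock x 7
 * for a single channel, pixel clock x 7/2 when two channels share the
 * pixel stream. */
unsigned lvds_clock_mhz(unsigned pixel_clock_mhz, int dual_channel)
{
    unsigned serial = pixel_clock_mhz * 7u;
    return dual_channel ? serial / 2u : serial;
}

/* Total LVDS data pairs in use: 3 per channel for 18 bpp,
 * 4 per channel for 24 bpp. */
unsigned lvds_data_pairs(unsigned bpp, int dual_channel)
{
    unsigned per_channel = (bpp == 24u) ? 4u : 3u;
    return dual_channel ? 2u * per_channel : per_channel;
}
```

Both the 85 MHz single-channel limit and the 170 MHz dual-channel limit come out at the same 595 MHz LVDS clock.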
LDB Features
• LDB Structure:
− 2 channels, same/independent data
− Each channel contains 4 data pairs + 1 clock pair
• Resolutions/Rates:
− Single Channel (up to WXGA): up to 85 MHz, 3 or 4 data pairs
− Dual Channel (up to UXGA): up to 170 MHz, 6 or 8 data pairs
− For example: can support 1080p60 or UXGA @ 60 fps
• Pixel Depths:
− 18 bpp – 3 LVDS data pairs
− 24 bpp – 4 LVDS data pairs
• Control signals:
− Supports HSYNC, VSYNC, DE
HDMI General Features
• Description: High-Definition Multimedia Interface (HDMI) transmitter including both HDMI TX controller and PHY
• Standard Compliance: HDMI 1.4a, DVI 1.0, HDCP 1.4 (with keys stored in embedded eFuses)
− Supporting majority of primary 3D video formats
• TMDS Core Frequency: From 25 MHz to 340 MHz
• Consumer Electronic Control: Supported
• Monitor Detection: Hot plug/unplug detection and link status monitor support
• Testing Capabilities: Integrated test module
• Maximal Power Consumption: 70 mW
• Temperature Range: -40C to +125C (Tj)
i.MX6 Dual/Quad: HDMI Video/Audio Features
• Video Standard Compliance: EIA/CEA-861D
• Supported Video Resolutions: Up to 1080p@60Hz and 720p/1080i@120Hz HDTV display; up to QXGA graphics display
• Pixel Clock Frequency: From 25 MHz to 240 MHz
• Video Data Formats: YCbCr 4:4:4; RGB 4:4:4; YCbCr 4:2:2
• Internal Video Processing: Interpolation YCbCr 4:2:2 to 4:4:4; conversion YCbCr to RGB and vice versa
• Audio Standard Compliance: IEC60958, IEC61937
• Supported Audio Formats: All audio formats as specified by the HDMI Specification Version 1.4a
• Audio Input Interfaces: Embedded Audio DMA
• Audio Sampling Rate: Up to 192 kHz
IPU & the iMX Linux BSP
IPU Drivers
• IPU drivers are based on code re-used from the MX5x IPU driver
− Key modification: support for multiple instances provides support for the 2 IPU modules in i.MX 6Quad/Dual
• IPU functionality is accessed through multiple interfaces:
− IPU framebuffer (FB) driver: accessed through the Linux standard FB interface
  Introduction of MXC Display Driver framework, to manage interaction between IPU and display device drivers (e.g., LCD, LVDS, HDMI, MIPI, etc.)
− IPU processing driver: a custom API exposes IPU processing functionality (resizing, rotation, combining of graphics planes, CSC, de-interlacing)
− Video 4 Linux 2 (V4L2) output driver: based on V4L2 video API; leverages IPU processing driver
− V4L2 capture driver: based on V4L2 capture API; leverages IPU processing driver and IPU core driver
IPU Drivers – Architecture Design
• IPU sub-module (CSI, IC, DI, DP, IDMAC, etc.) functionality is provided in a set of IPU Core driver functions
− Largely unmodified between MX5x and MX 6Quad
• Leverage existing Linux APIs:
− For framebuffer access (IPU FB driver)
− Video image processing and display (V4L2 output driver)
− Image capture (V4L2 capture driver)
• Fill in gaps with custom APIs and API extensions:
− Extensions to FB interface to control certain IPU functionality – local and global alpha, gamma correction, etc.
− IPU Processing driver to provide user space access to IPU processing capabilities
• Provide MXC Display Driver (mxc_dispdrv.h) framework to simplify connection between display devices and IPU modules.
IPU Drivers
[Software stack diagram. Legend: kernel core software, Freescale BSP software, hardware. Blocks shown below user space: V4L2 Soc-Camera subsystem, cameras, mxc v4l2 capture, V4L2 video framework, mxc v4l2 output, IPU Processing Driver, Frame Buffer Core, IPU FB driver, MXC Display Driver Framework, display device drivers (LCD, LVDS, HDMI, etc), IPU Core Driver (CSI/IDMA/IC/DC/DP), hardware.]
Overview of IPU Drivers
• MXC Display Driver
− Simple framework to manage MXC display device drivers.
− Examples: LCD, TVE, MIPI, VGA, HDMI
• IPU Processing Driver
− Manage IPU IC tasks in kernel space.
• MXC V4L2 Drivers
− Based on IPU processing driver.
MXC Display Driver - Files
• MXC Display Driver files
  drivers/video/mxc/mxc_dispdrv.h
  drivers/video/mxc/mxc_dispdrv.c
• IPU framebuffer driver
  drivers/video/mxc/mxc_ipuv3_fb.c
• Display device drivers
  drivers/video/mxc/mxc_lcdif.c
  drivers/video/mxc/mxc_hdmi.c
  drivers/video/mxc/mipi_dsi.c
  …
MXC Display Driver - Structures
struct mxc_dispdrv_driver {
    const char *name;
    int (*init) (struct mxc_dispdrv_handle *, struct mxc_dispdrv_setting *);
    void (*deinit) (struct mxc_dispdrv_handle *);
    /* display driver enable function for extension */
    int (*enable) (struct mxc_dispdrv_handle *, struct fb_info *);
    /* display driver disable function, called at early part of fb_blank */
    void (*disable) (struct mxc_dispdrv_handle *, struct fb_info *);
    /* display driver setup function, called at early part of fb_set_par */
    int (*setup) (struct mxc_dispdrv_handle *, struct fb_info *fbi);
};

struct mxc_dispdrv_setting {
    /* input-feedback parameter */
    struct fb_info *fbi;
    int if_fmt;
    int default_bpp;
    char *dft_mode_str;
    /* feedback parameter */
    int dev_id;
    int disp_id;
};
MXC Display Driver - Functions
struct mxc_dispdrv_entry *mxc_dispdrv_register(struct mxc_dispdrv_driver *drv);
int mxc_dispdrv_unregister(struct mxc_dispdrv_entry *entry);
struct mxc_dispdrv_handle *mxc_dispdrv_gethandle(char *name, struct mxc_dispdrv_setting *setting);
int mxc_dispdrv_setdata(struct mxc_dispdrv_entry *entry, void *data);
void *mxc_dispdrv_getdata(struct mxc_dispdrv_entry *entry);
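To make the structures more concrete, here is a minimal sketch of how a display device driver might fill in its descriptor. The struct layouts are copied from the slide so the example builds outside the kernel tree; the driver name, the callback bodies and the reading of dev_id/disp_id as IPU and DI indexes are assumptions for illustration, as is leaving the optional callbacks NULL. In the BSP the descriptor would then be passed to mxc_dispdrv_register().

```c
#include <stddef.h>

/* Opaque stand-ins so the sketch builds outside the kernel tree. */
struct fb_info;
struct mxc_dispdrv_handle;

/* Layouts as shown on the slide (in the BSP they come from mxc_dispdrv.h). */
struct mxc_dispdrv_setting {
    struct fb_info *fbi;
    int if_fmt;
    int default_bpp;
    char *dft_mode_str;
    int dev_id;
    int disp_id;
};

struct mxc_dispdrv_driver {
    const char *name;
    int (*init)(struct mxc_dispdrv_handle *, struct mxc_dispdrv_setting *);
    void (*deinit)(struct mxc_dispdrv_handle *);
    int (*enable)(struct mxc_dispdrv_handle *, struct fb_info *);
    void (*disable)(struct mxc_dispdrv_handle *, struct fb_info *);
    int (*setup)(struct mxc_dispdrv_handle *, struct fb_info *);
};

/* Hypothetical init callback: fills in the feedback parameters. */
int demo_panel_init(struct mxc_dispdrv_handle *handle,
                    struct mxc_dispdrv_setting *setting)
{
    (void)handle;
    setting->dev_id = 0;   /* assumed meaning: IPU index */
    setting->disp_id = 1;  /* assumed meaning: DI index */
    return 0;
}

/* Hypothetical deinit callback: nothing to release in this sketch. */
void demo_panel_deinit(struct mxc_dispdrv_handle *handle)
{
    (void)handle;
}

/* Descriptor a driver would hand to mxc_dispdrv_register() in the kernel. */
struct mxc_dispdrv_driver demo_panel_driver = {
    .name   = "demo_panel",
    .init   = demo_panel_init,
    .deinit = demo_panel_deinit,
    /* enable, disable and setup are left NULL in this sketch */
};
```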
MXC Display Driver – Configuration Flow
[Flow diagram between the LDB display driver, the mxc_dispdrv list and the IPUv3 fb driver. Numbered steps as labelled in the diagram:]
1. mxc_dispdrv_register, mxc_dispdrv_setdata
2. mxc_dispdrv_gethandle (IPUv3 fb driver: dev=ldb, mode_str=?, fbi)
3. init
4. fb_add_videomode, fb_set_var (LDB: RGB666, LDB-XGA, ipu0 di1)
MXC Display Driver
• Command Line Options:
  first display: video=mxcfb0:dev=dispdrv_name,mode_str,if=if_fmt
  second display: video=mxcfb1:dev=dispdrv_name,mode_str,if=if_fmt
For example:
  video=mxcfb0:dev=hdmi,1920x1080M@60,if=RGB24
  video=mxcfb1:dev=ldb,LDB-XGA,if=RGB666
  video=mxcfb2:dev=lcd,800x480M@55,if=RGB565
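The option layout can be made explicit with a small parser. This is only a sketch of the three-field form shown on the slide: the struct and function names are hypothetical, and the kernel's own parsing may accept other orders or additional options.

```c
#include <stdio.h>
#include <string.h>

/* Fields of one "video=mxcfbN:dev=NAME,MODE,if=FMT" option. */
struct mxcfb_option {
    int  fb_index;       /* N in mxcfbN */
    char dev[16];        /* dispdrv_name, e.g. hdmi, ldb, lcd */
    char mode_str[32];   /* mode_str, e.g. 1920x1080M@60 or LDB-XGA */
    char if_fmt[16];     /* if_fmt, e.g. RGB24 */
};

/* Parses the form shown on the slide. Returns 0 on success, -1 otherwise. */
int parse_mxcfb_option(const char *arg, struct mxcfb_option *out)
{
    int matched;

    memset(out, 0, sizeof(*out));
    matched = sscanf(arg, "video=mxcfb%d:dev=%15[^,],%31[^,],if=%15s",
                     &out->fb_index, out->dev, out->mode_str, out->if_fmt);
    return matched == 4 ? 0 : -1;
}
```

Note that the fields are separated by commas only; a space inside the option would end the kernel argument early.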
MXC Display Driver – Multi-Display Options
hdmi + lvds
  video=mxcfb0:dev=hdmi,1920x1080M@60,if=RGB24
  video=mxcfb1:dev=ldb,LDB-XGA,if=RGB666
lvds + lvds
  video=mxcfb0:dev=ldb,LDB-XGA,if=RGB666
  video=mxcfb1:dev=ldb,LDB-XGA,if=RGB666
lcd + lvds
  video=mxcfb0:dev=lcd,800x480M@55,if=RGB565
  video=mxcfb1:dev=ldb,LDB-XGA,if=RGB666
hdmi + lvds + lvds
  video=mxcfb0:dev=hdmi,1920x1080M@60,if=RGB24
  video=mxcfb1:dev=ldb,LDB-XGA,if=RGB666
  video=mxcfb2:dev=ldb,LDB-XGA,if=RGB666
Example of MXC Display Driver - HDMI
• Software Architecture:
− HDMI multifunction driver (MFD) manages software resources common to video and audio drivers
− Audio driver uses ALSA/SoC audio framework.
− Video driver:
  MXC Display Driver API to register with IPU FB driver
  Linux Framebuffer (FB) API to change the video mode and receive notifications of mode changes
HDMI and the i.MX 6Quad Framebuffer and Display Device Architecture
IPU Processing Driver - Introduction
• Each IPU has two kernel threads for the IC tasks PP & VF
• Each kernel thread performs the tasks on its task queue list
• Each task executes the following sequence: ipu_init_channel → ipu_init_channel_buffer → request_ipu_irq → ipu_enable_channel → wait_irq (task finish) → ipu_disable_channel → ipu_uninit_channel
• Tasks are based on single buffer mode
• Split mode tasks will be split into 2 tasks per IPU.
• An application only needs to prepare a task and queue it
• Task operations include
− Setting the task input/overlay/output/rotation/deinterlacing/buffer
− Call ioctl IPU_CHECK_TASK first to adjust parameters according to feedback
− Call ioctl IPU_QUEUE_TASK to queue task
• IPU_QUEUE_TASK is a blocking ioctl.
IPU Processing Driver
• Files
  include/linux/ipu.h
  drivers/mxc/ipu3/ipu_device.c
• Structures: see include/linux/ipu.h
• Ioctls
  #define IPU_CHECK_TASK _IOWR('I', 0x1, struct ipu_task)
  #define IPU_QUEUE_TASK _IOW('I', 0x2, struct ipu_task)
  #define IPU_ALLOC _IOWR('I', 0x3, int)
  #define IPU_FREE _IOW('I', 0x4, int)
IPU Processing Driver
• Example: linux-test/test/mxc_ipudev_test/mxc_ipudev_test.c
IPU Processing Driver
• Advantages
− Easy to use.
− Provides workaround for IPU suspend/resume issue
  Cannot suspend when double buffering is enabled
− All IC tasks may be based on this IPU processing driver, including user applications and V4L2 output and capture drivers
  Reason: Easier to debug if there is an issue.
• Disadvantages
− Based on kernel thread, all control is done by software, so Linux scheduler may have undesirable impact on the system.
− Single-buffer mode doesn’t perform as well as double-buffer mode
V4L2 – Common Kernel API
• What is Video4Linux (V4L)?
− V4L is the original video capture/overlay API of the Linux kernel. It appeared late in the 2.1.x development cycle of the Linux kernel.
• What about V4L2?
− V4L2 is the second generation of the video4linux API, which fixes a number of design bugs of the first version. It was integrated into the standard kernel in 2.5.x.
− V4L2 is an interface for analog radio, video capture and output drivers.
− Hardware acceleration capabilities (IPUv3) are leveraged in V4L2 drivers and provided in the Linux BSP
− Upper level software that uses the V4L2 API, such as GStreamer source/sink and Android camera HAL, does not need to understand the underlying hardware.
• Documentation / Web Resource / API spec
− Documentation/video4linux/ subdirectory in kernel tree.
− http://v4l2spec.bytesex.org - the spec of V4L2
− http://www.linuxtv.org/wiki/index.php/Main_Page - wiki for V4L and DVB
MXC V4L2 – User APIs
− VIDIOC_QUERYCAP
− VIDIOC_G_FMT / VIDIOC_S_FMT
− VIDIOC_REQBUFS
− VIDIOC_QUERYBUF
− VIDIOC_QBUF / VIDIOC_DQBUF
− VIDIOC_STREAMON / VIDIOC_STREAMOFF
− VIDIOC_G_CTRL / VIDIOC_S_CTRL
− VIDIOC_CROPCAP / VIDIOC_G_CROP / VIDIOC_S_CROP
− VIDIOC_ENUMOUTPUT / VIDIOC_G_OUTPUT / VIDIOC_S_OUTPUT
APIs used only for MXC V4L2 capture:
− VIDIOC_ENUMINPUT / VIDIOC_G_INPUT / VIDIOC_S_INPUT
APIs used only for MXC V4L2 TV-in:
− VIDIOC_ENUMSTD / VIDIOC_G_STD / VIDIOC_S_STD
MXC V4L2 – Internal APIs
New version of V4L2 framework supports master / slave device drivers:
− Support multiple master devices and multiple slave devices
− mxc_v4l2_capture driver is the V4L2 master driver
− Camera drivers and tv-in driver are V4L2 internal slave drivers
These ioctls are used internally in the kernel, especially for MXC V4L2 capture/TV-in:
− ioctl_dev_init / ioctl_dev_exit
− ioctl_s_power
− ioctl_g_ifparm
− ioctl_init
− ioctl_g_fmt_cap
− ioctl_g_parm / ioctl_s_parm
− ioctl_queryctrl / ioctl_g_ctrl / ioctl_s_ctrl
V4L2 – Usage and Examples
• How to use V4L2:
Generally, programming a V4L2 device consists of these steps:
− Opening the device
− Changing device properties, selecting a video and audio input, video standard, picture brightness, etc.
− Negotiating a data format
− Negotiating an input/output method
− Executing the actual input/output loop
− Closing the device
• Please refer to the following examples:
− BSP team’s test cases: git clone git://sw-git01-tx30.am.freescale.net/linux-test.git (without password)
− Test team’s VTE test cases
− Camera HAL in Android source code
− GStreamer source/sink source code
Features of MXC V4L2 Capture
• Still capture: capture a frame in a buffer, users can read the buffer and store the picture in a file.
• Preview: show the capture frames directly onto the framebuffer. Users can choose the framebuffer number on which the video will be shown.
• Video capture: capture frames in allocated buffers. Users can get the frames by calling VIDIOC_DQBUF and then send them to VPU for encoding or save them in a file.
Features of MXC V4L2 Capture
• To capture a still image
− No resizing/rotation/CSC can be done.
− An image can be converted from the raw data to a different pixel format within the same color space by using the CSI->SMFC->MEM IDMAC channel.
Features of MXC V4L2 Capture
• To preview a captured video on frame buffer
− Preview on fb0 – Resizing/rotation/CSC can be done in IC(PRP_VF) channels. Manually control the buffer ready flags in interrupt handler. Use DP(DP_BG) channel to display the captured frames.
− Preview on fb2 – Resizing/rotation can be done in IC(PRP_VF) channels. CSC can be done in DP(DP_FG) channel. The flow is totally controlled by FSU.
Features of MXC V4L2 Capture
• To capture frames in allocated buffers (using IC channel)
− Resizing/rotation/CSC can be done in IC(PRP_ENC) channels. Manually control the buffer ready flags in interrupt handler.
− MXC V4L2 capture maintains a number of buffers.
− Users can get the captured buffer by calling the VIDIOC_DQBUF ioctl and return the buffer to the kernel by calling the VIDIOC_QBUF ioctl.
NOTE: Camera preview and capturing frames into buffers can be used at the same time.
Features of MXC V4L2 Capture
• To capture frames in allocated buffers (using SMFC channel)
− No resizing/rotation/CSC can be done. Manually control the buffer ready flags in interrupt handler.
− MXC V4L2 capture maintains a number of buffers.
− Users can get the captured buffer by calling the VIDIOC_DQBUF ioctl and return the buffer to the kernel by calling the VIDIOC_QBUF ioctl.
Features of MXC V4L2 Output
• Support for playing video using one framebuffer at a time:
− DP-BG framebuffer
− DP-FG framebuffer
− DC framebuffer
• Support for the following modes:
− IC normal mode – resizing / CSC / rotation, using PP channel
− IC bypass mode – CSC, using DP or DC channel directly
− IC horizontal/vertical split mode – resizing / CSC / rotation, support high resolution output, using PP channel
− VDI-IC video deinterlacing mode – deinterlacing / resizing / CSC / rotation, using PRP_VF channel, including high motion mode and low motion mode.
Note: V4L2 output and V4L2 capture can run at the same time if there is no IC or DP/DC channel conflict.
Features of MXC V4L2 Output
• IC normal mode (using PP channel)
− Resizing/rotation/CSC. Manually control the IC output/display input buffer ready flags in interrupt handler and control IC input buffer ready flags in timer handler.
− MXC V4L2 output maintains a number of buffers.
− Users show a buffer on one framebuffer by queuing it with the VIDIOC_QBUF ioctl and get it back for reuse by calling the VIDIOC_DQBUF ioctl.
Features of MXC V4L2 Output
• IC bypass mode (using DP or DC channel)
− CSC can be done, but no resizing or rotation can be done. Manually control the display output buffer ready flags in the interrupt handler.
− MXC V4L2 output maintains a number of buffers.
− Users show a buffer on one framebuffer by queuing it with the VIDIOC_QBUF ioctl and get it back for reuse by calling the VIDIOC_DQBUF ioctl.
Features of MXC V4L2 Output
• IC horizontal split mode (using PP channel)
− Resizing/rotation/CSC. Manually control the IC output/IC input (right stripe)/display input buffer ready flags in the interrupt handler and control IC input (left stripe) buffer ready flags in the timer handler.
− MXC V4L2 output maintains a number of buffers.
− Users show a buffer on one framebuffer by queuing it with the VIDIOC_QBUF ioctl and get it back for reuse by calling the VIDIOC_DQBUF ioctl.
Features of MXC V4L2 Output
• VDI-IC video deinterlacing mode (using PRP_VF channel)
− Resizing/rotation/CSC can be done. Manually control the IC output/display input buffer ready flags in interrupt handler and control IC input buffer ready flags in timer handler.
− MXC V4L2 output maintains a number of buffers.
− Users show a buffer on one framebuffer by queuing it with the VIDIOC_QBUF ioctl and get it back for reuse by calling the VIDIOC_DQBUF ioctl.
How do we integrate IPUv3 into MXC V4L2?
• Based on analysis of the IPUv3 spec
• What channel should we use for the framebuffer?
• What channel should we use for V4L2 capture and V4L2 output?
• IPU low-level API design – enable/disable channel, init/uninit channel, init channel buffer, interrupt handler register interface…
• Invoke IPU low-level APIs from the MXC V4L2 driver.
• Ensure backwards compatibility in the IPU low-level APIs in cases where the hardware has not changed dramatically.
Use case Examples & Tips
IPUv3 tips
• Use VDOA
− For more efficient DDR access pattern
• Refresh the display at the rate of the content
− For displays that perform frame rate conversion.
− Sometimes called 24P cinema
− Significantly reduces the amount of data read by IPU.
IPUv3 tips
• Buffer management
− IPU write channel needs a free buffer in the DDR to start writing data.
− If there’s no free buffer, the IPU’s internal FIFOs are filled, causing additional latencies
− Buffer management system should guarantee that there is always a free buffer for the IPU’s usage.
− IPU can then start writing the data to that free buffer immediately, avoiding unnecessary latency
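One way to picture the "always a free buffer" rule is a small pool with explicit buffer states. The sketch below is purely illustrative and not taken from the BSP: the pool size, state names and functions are hypothetical, and a real implementation would also need locking against the interrupt handler.

```c
#define POOL_SIZE 3  /* assumption: one being written, one being consumed, one spare */

enum buf_state { BUF_FREE, BUF_WRITING, BUF_READY };

struct buf_pool {
    enum buf_state state[POOL_SIZE];
};

/* Picks a free buffer for the next IPU write; returns its index or -1
 * when the writer would have to stall. */
int pool_acquire_free(struct buf_pool *pool)
{
    int i;

    for (i = 0; i < POOL_SIZE; i++) {
        if (pool->state[i] == BUF_FREE) {
            pool->state[i] = BUF_WRITING;
            return i;
        }
    }
    return -1;
}

/* Called when the write into buffer i has completed. */
void pool_write_done(struct buf_pool *pool, int i)
{
    pool->state[i] = BUF_READY;
}

/* Called when the consumer has finished with buffer i. */
void pool_release(struct buf_pool *pool, int i)
{
    pool->state[i] = BUF_FREE;
}

/* Number of buffers a writer could start on right now. */
int pool_free_count(const struct buf_pool *pool)
{
    int i, count = 0;

    for (i = 0; i < POOL_SIZE; i++)
        if (pool->state[i] == BUF_FREE)
            count++;
    return count;
}
```

The consumer side should release buffers early enough that pool_free_count() never reaches zero while a new frame is due.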
IPUv3 tips
• Move load from the IC
− Perform CSC (Color Space Conversion) in the DP (Display Processor) and not in the IC (saves memory bandwidth and lowers the load on the IC).
− Move combining tasks to the VDIC (if not used as de-interlacer)
− Consider the IC processing speed for the tasks:
  Resize – 2 cycles/pixel
  Combine – 2 cycles/pixel
  CSC – 3 cycles/pixel
• Flipping an image (a.k.a. 180º rotation)
− Use H-flip and V-flip transfers, done by IDMAC and IC, and not using the IRT module.
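The per-pixel costs above allow a first-order estimate of how much of the IPU clock a single IC task consumes. The sketch below uses hypothetical names and assumes the 264 MHz clock quoted earlier for i.MX6; it ignores memory stalls and does not model how costs combine when tasks are chained.

```c
#include <stdint.h>

/* IC cost per pixel as quoted on the slide. */
enum { IC_RESIZE_CPP = 2, IC_COMBINE_CPP = 2, IC_CSC_CPP = 3 };

/* Cycles per second one IC task spends on a width x height stream at fps. */
uint64_t ic_cycles_per_sec(uint32_t width, uint32_t height, uint32_t fps,
                           unsigned cycles_per_pixel)
{
    return (uint64_t)width * height * fps * cycles_per_pixel;
}

/* Share of the IPU clock used by that task, in percent (rounded down). */
unsigned ic_load_percent(uint64_t cycles_per_sec, uint32_t ipu_clock_hz)
{
    return (unsigned)(cycles_per_sec * 100u / ipu_clock_hz);
}
```

For example, CSC on a 1280x720 stream at 30 fps costs about 82.9 million cycles per second, roughly 31% of a 264 MHz clock, which is the load that moving CSC to the DP would remove from the IC.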
IPUv3 tips - Optimizing memory accesses
• Optimize pixel formats
• The larger the chunks of data are, the easier it is on the DDR
• The smaller the amount of bursts, the better for the memory bus system
• Choose the mode that works best for the specific use case and avoid the rest
Format | Amount of data per macro block | Burst size | DDR3 x64 BL | Amount of bursts per macro block
YUV422 interleaved | 256 bytes | 16 bytes | 2 | 16
YUV422 partial interleaved | 256 bytes | 8 bytes + 8 bytes | 1 + 1 | 32
YUV422 non interleaved | 256 bytes | 8 bytes + 4 bytes + 4 bytes | 1 + 1 + 1 | 48
YUV420 interleaved | 256 bytes | 16 bytes | 2 | 16
YUV420 partial interleaved (NV12) | 192 bytes | 8 bytes + 8 bytes | 1 + 1 | 32
YUV420 non interleaved | 192 bytes | 8 bytes + 4 bytes + 4 bytes | 1 + 1 + 1 | 48
Target for YUV422 interleaved: Best: IPU => IPU; VDOA => IPU
Further targets listed in the table: Best: VPU => VDOA (decode); Best: IPU => VPU (encode)
IPUv3M tips
• How to work efficiently with the memory system
− Use real time channels
  Marking IPU accesses with an AXI ID to bypass the PL301’s arbitration
− Lock feature
  Issue a series of IPU bursts that belong to the same channel – better chance for DDR hit
− Conditional read
  If an alpha mask is provided to the overlay plane, transparent pixels are not read from memory.
IPUv3 tips
• Recommended Display Connectivity, i.MX51
IPU_DISP1 port | 24-bit RGB
DISP1_DAT0 to DISP1_DAT7 | B0 to B7
DISP1_DAT8 to DISP1_DAT15 | G0 to G7
DISP1_DAT16 to DISP1_DAT23 | R0 to R7
DI1_PIN2 | HSYNC
DI1_PIN3 | VSYNC
DI1_PIN15 | DRDY
DI1_DISP_CLK | CLK
The table also has columns for RGB666 (R0–R5, G0–G5, B0–B5), RGB565 (R0–R4, G0–G5, B0–B4), RGB555 (R0–R4, G0–G4, B0–B4) and 24-bit YCbCr 4:4:4 (Y0–Y7, Cb0–Cb7, Cr0–Cr7).
IPUv3 - debug
• IPU error interrupts & status bits
− IPU errors are reported in the IPU_INT_STAT_5, IPU_INT_STAT_6, IPU_INT_STAT_9 and IPU_INT_STAT_10 registers. The first debug step should be inspecting these bits.
− A flickering display is normally a result of system bus load (DDR). This is reported as “new frame before end of frame error” in the IDMAC_NFB4EOF register.
− Bus loads that cause errors on the CSI side are reported in the *FRM_LOST* status bits.
IPUv3 - debug
• IPU diagnostics unit
− Some of the IPU internal signals can be routed to pins and measured using the IPU diagnostics unit.
− These can be used to capture errors/interrupts and track internal flows.
− The IOMUX needs to be configured to output the ipu_diagbus signals.
• Task status and flow control
− A frozen display is sometimes a result of wrong control of the buffer management within the IPU.
− The status of each flow controlled by the FSU can be monitored using the TASKS_STAT status registers.
− In some cases a user may follow the BUF_RDY and CUR_BUF indications of a flow to track it.
Dual video-in use case example
[Data-flow diagram: rear-view camera video in (YUV, 20Hz, ITU 656) and RGB video in (RGB, 60Hz) entering the IPU; IC (VF) and IC (PP - copy) tasks; memory buffers for temp and background (labelled 2x and 3x); an instrumental layer rendered by the GPU; DP output to the display. Several paths are marked “Inverted”, two of them “Inverted, 60Hz ?”.]
− Bypass path, depending on needs
− Temp is needed for cases BG freq > Vid one
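Playback, HD1080p H.264 HP –> Display
[Data-flow diagram: Video (720p, YUV 4:2:0) and GUI (RGBA 8888) buffers in memory; IPU blocks CSI, VDI, IC, DP and DC/DI; an intermediate WXGA YUV 4:2:2 buffer; output to the display. The legend distinguishes 60 frames per sec and 30 frames per sec paths.]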
Dual Playback, HD720p H.264 HP –> WSVGA Display
[Data-flow diagram: Video (720p, YUV 4:2:0) and GUI (RGBA 8888) buffers in memory; IPU blocks CSI, VDI, IC, DP and DC/DI; intermediate WSVGA buffers (YUV 4:2:2, RGB 888). The legend distinguishes 60 frames per sec and 30 frames per sec paths.]
Demo
Q & A
www.Freescale.com
© 2014 Freescale Semiconductor, Inc. | External Use