
SUSE Enterprise Storage powered by Ceph
SUSE
Tom D’Hont
#opentechday
#suse
Ceph
About
• Scale-out
• Object store
• Multiple interfaces
• Open source
• Community based
• Common hardware
• Self-healing & managing
https://ceph.com
Ceph
Community
Code developers: 782 (core: 22, regular: 53, casual: 705)
Total downloads: 160,015,454
Unique downloads: 21,264,047
Ceph
Object Storage Daemon (OSD)
Each OSD manages a file system (XFS) or a BlueStore back-end on a physical disk or other persistent storage device
• OSDs serve storage objects to clients
• OSDs peer with one another to perform replication, recovery and scrubbing
• The journal is often stored on faster media such as SSD (often shared)
Ceph
Storage node
Put several OSDs in one Storage Node
Ceph
Monitor node
• Monitors are the brain cells of the cluster
- They maintain cluster membership (the cluster map)
- They provide consensus for distributed decision making
• Not in the performance path
- They do not serve stored objects to clients (see the status-query sketch below)
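Because the monitors hold the authoritative cluster map, cluster status is a question for them rather than for the OSDs. A minimal sketch with the python-rados binding (assumes a reachable cluster, a client keyring and /etc/ceph/ceph.conf; the exact JSON layout of the status reply differs between Ceph releases):

```python
import json
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')   # mon addresses + keyring from ceph.conf
cluster.connect()

# Ask the monitors for the cluster status; they answer metadata queries
# but never sit in the data path between clients and OSDs.
cmd = json.dumps({"prefix": "status", "format": "json"})
ret, outbuf, errs = cluster.mon_command(cmd, b'')
if ret == 0:
    print(json.loads(outbuf).get("health"))   # health section, e.g. HEALTH_OK details

cluster.shutdown()
```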
Ceph
RADOS: Reliable Autonomic Distributed Object Store (the cluster of OSDs and monitors)
Ceph
RADOS interfaces: librados (native API), RADOS Gateway (S3/Swift object), RBD (block) and CephFS (file)
Ceph
Replication options
Replication: full copies of stored objects
- Very high durability
- 3x (200% overhead)
- Quicker recovery
Erasure coding: one copy plus parity
- Cost-effective durability
- 1.5x (50% overhead)
- Expensive recovery
(overhead arithmetic: see the sketch below)
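The overhead figures above are simple arithmetic; a quick check in Python (the k=4, m=2 profile is just one example that yields 1.5x):

```python
def replication_overhead(copies):
    """Extra raw capacity per byte of user data: 3 full copies -> 2.0 (200%)."""
    return copies - 1

def erasure_overhead(k, m):
    """Erasure coding stores k data chunks plus m coding chunks -> overhead m/k."""
    return m / k

print(replication_overhead(3))   # 2.0 -> 3x raw usage, 200% overhead
print(erasure_overhead(4, 2))    # 0.5 -> 1.5x raw usage, 50% overhead
```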
Ceph
CRUSH placement algorithm
Pseudo-random data placement algorithm
- Fast calculation, no lookup table
- Repeatable, deterministic
- Statistically uniform distribution
CRUSH uses a map of OSDs in the cluster
- Includes physical topology, like row, rack, host
- Includes rules describing which OSDs to consider (a toy placement sketch follows below)
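To make the "fast, repeatable, no lookup table" properties concrete, here is a toy, hash-based placement function. It is emphatically not the real CRUSH algorithm (CRUSH walks a weighted hierarchy defined in the CRUSH map); the host names and OSD ids below are invented for the example.

```python
import hashlib

# Hypothetical topology: host -> OSD ids (failure domain = host)
CLUSTER_MAP = {
    "host-a": [0, 1, 2],
    "host-b": [3, 4, 5],
    "host-c": [6, 7, 8],
}

def place_pg(pg_id: str, replicas: int = 3):
    """Deterministically pick one OSD per host for a PG, purely by hashing."""
    picks = []
    for host, osds in sorted(CLUSTER_MAP.items()):
        digest = hashlib.sha256(f"{pg_id}/{host}".encode()).digest()
        picks.append((host, osds[digest[0] % len(osds)]))
    # Order the hosts by a second hash so different PGs prefer different hosts
    picks.sort(key=lambda p: hashlib.sha256(f"{pg_id}:{p[0]}".encode()).digest())
    return picks[:replicas]

print(place_pg("1.2f"))   # same input -> same OSD set, computed, never looked up
```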
Ceph
Placement Group (PG)
• Balance data across OSDs in the cluster
• One PG typically exists on several OSDs for replication
• One OSD typically serves many PGs
(objects are mapped to a PG by hashing their name; see the sketch below)
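Ceph maps an object to a PG by hashing the object name into the pool's PG count (the real code uses rjenkins hashing plus a "stable mod" so pg_num can grow smoothly). A simplified sketch, with a generic hash standing in:

```python
import hashlib

def object_to_pg(pool_id: int, object_name: str, pg_num: int) -> str:
    """Hash the object name into one of the pool's placement groups."""
    h = int.from_bytes(hashlib.sha256(object_name.encode()).digest()[:4], "little")
    return f"{pool_id}.{h % pg_num:x}"      # PG ids are conventionally <pool>.<hex>

print(object_to_pg(1, "vm-disk-0001", pg_num=128))
print(object_to_pg(1, "vm-disk-0002", pg_num=128))   # objects spread over many PGs
```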
Ceph
Placement Group (PG)
• Each placement group maps to a pseudo-random set of OSDs
• When an OSD fails, recovery generally involves all OSDs in the pool re-replicating the affected PGs among themselves
• Massively parallel recovery
Ceph
Pools
• Logical container for storage objects
• Defines the number of replicas OR the erasure coding settings
• Defines the number of placement groups
Pool operations (see the librados sketch below)
- Create object
- Remove object
- Read object
- Write entire object
- Snapshot of the entire pool
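A minimal librados sketch of the pool operations listed above, via the python-rados binding; the pool name 'mypool' and object name 'greeting' are invented for the example, and a reachable cluster with a client keyring is assumed.

```python
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

ioctx = cluster.open_ioctx('mypool')          # I/O context bound to one pool
ioctx.write_full('greeting', b'hello ceph')   # create / overwrite the entire object
print(ioctx.read('greeting'))                 # read it back
ioctx.remove_object('greeting')               # remove the object

ioctx.close()
cluster.shutdown()
```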
Ceph
Cache tiered pools: a fast cache pool (e.g. SSD-backed) placed in front of a slower backing pool
Ceph!
But why SUSE?
• 1st enterprise Linux distribution
• 1st enterprise OpenStack distribution
• 25+ years of open source engineering experience
• +8% SUSE growth vs. other Linux in 2015*
• Top 15 worldwide system infrastructure software vendor
• 1.4B annual revenue
• 10 awards in 2016 for SUSE Enterprise Storage
• 2/3+ of the Fortune Global 100 use SUSE Linux Enterprise
• 50%+ development engineers
SUSE Enterprise Storage
Enable transformation
Mode 1 (Gartner: Traditional) – where you probably are today
- Legacy data center: network, compute and storage silos
- Traditional protocols: Fibre Channel, iSCSI, CIFS/SMB, NFS
- Process driven, slow to respond
- Support today's investment
Mode 2 (Gartner: Software Defined) – where you need to get to
- Software-defined data center: software-defined everything
- Agile infrastructure supporting a DevOps model
- Business driven
- Adapt to the future
SUSE Enterprise Storage
Use cases
Bulk Storage
• SharePoint data
• Medical records
• Medical images: X-rays, MRIs, CAT scans
• Financial records
Data Archive – long-term storage and backup
• HPC
• Log retention
• Tax documents
• Revenue reports
Video Surveillance
• Security surveillance
• Red light / traffic cameras
• License plate readers
• Body cameras for law enforcement
• Military/government visual reconnaissance
Virtual Machine Storage – low and mid I/O performance for major hypervisor platforms (see the RBD sketch below)
• KVM: native RBD
• Hyper-V: iSCSI
• VMware: iSCSI
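For the KVM path, virtual machine disks are native RBD images. A minimal sketch with the python-rbd binding (the pool 'rbd-pool' and image 'vm-disk-0001' are invented names; in practice libvirt/QEMU attaches the image directly):

```python
import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd-pool')

rbd.RBD().create(ioctx, 'vm-disk-0001', 10 * 1024**3)   # 10 GiB thin-provisioned image

image = rbd.Image(ioctx, 'vm-disk-0001')
image.write(b'\x00' * 4096, 0)     # the hypervisor normally performs the I/O
print(image.size())
image.close()

ioctx.close()
cluster.shutdown()
```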
SUSE Enterprise Storage
Data Capacity Utilization
Tier 0
- Ultra high performance
Tier 1
- High-value, OLTP, Revenue Generating
Tier 2
- Backup/recovery, reference data, bulk data
Tier 3
- Object archive
- Compliance archive
- Long-term retention
SUSE Enterprise Storage 4
Major features summary
SUSE Enterprise Storage 4
openATTIC
SUSE Enterprise Storage
Roadmap
2016 – SUSE Enterprise Storage 3 (V3)
Built on
• Ceph Jewel release
• SLES 12 SP1 (Server)
Manageability
• Initial Salt integration (Tech Preview)
Interoperability
• CephFS (Tech Preview)
• AArch64 (Tech Preview)
Availability
• Multisite object replication (Tech Preview)
• Asynchronous block mirroring (Tech Preview)

2017 – SUSE Enterprise Storage 4 (V4)
Built on
• Ceph Jewel release
• SLES 12 SP2 (Server)
Manageability
• SES openATTIC management
• Initial Salt integration
Interoperability
• AArch64
• CephFS (production use cases)
• NFS Ganesha (Tech Preview)
• NFS access to S3 buckets (Tech Preview)
• CIFS Samba (Tech Preview)
• RDMA/InfiniBand (Tech Preview)
Availability
• Multisite object replication
• Asynchronous block mirroring

2018 – SUSE Enterprise Storage 5 (V5)
Built on
• Ceph Luminous release
• SLES 12 SP3 (Server)
Manageability
• SES openATTIC management phase 2
• SUSE Manager integration
Interoperability
• NFS Ganesha
• NFS access to S3 buckets
• CIFS Samba (Tech Preview)
• Fibre Channel (Tech Preview)
• RDMA/InfiniBand
• Support for containers
Availability
• Asynchronous block mirroring
• Erasure coded block pool
Efficiency
• BlueStore back-end
• Data compression
• Quality of Service (Tech Preview)

Information is forward looking and subject to change at any time.
SUSE Enterprise Storage 4
Case study
4 DC campus
480 TB
CIFS / NFS
Existing landscape
• File data synchronized between sites with Robocopy and rsync
Proposed landscape
CIFS/NFS network: 10GbE

[HP DL360 Gen9] CIFS/NFS gateway / management node
• 2x E5-2630v3
• 4x 16GB PC4-2133
• 1x Dual 120GB SSD M.2
• 2x 10GbE-T
• 500W redundant PSU

[HP DL160 Gen9] monitoring node
• 1x E5-2603v3
• 1x 8GB PC4-2133
• 1x Dual 120GB SSD M.2
• 1x 10GbE-T
• 550W PSU

[HP DL380 Gen9] OSD node (OSD nodes 1–8, the last providing spare capacity)
• 2x E5-2630v3
• 8x 16GB PC4-2133 + 2x 8GB PC4-2133
• 1x Dual 120GB SSD M.2
• 2x 400GB SSD
• 1x 800GB SSD
• 12x 8TB HDD
• 2x 10GbE-T
• 800W redundant PSU

Start with 480 TB net and extend in steps of 68.6 TB (see the calculation sketch below):
• 1 OSD node: 12x 8 TB = 96 TB raw
• 7 OSD nodes: 84x 8 TB = 672 TB raw (96x7)
• Erasure coding k=5, m=2: 480 TB net (672/7x5)
• 1 additional OSD node at k=5, m=2: 68.6 TB net (480/7)
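The capacity numbers above follow from the drive count and the k=5, m=2 profile; a quick check in plain Python (the function name is just for the example):

```python
def net_capacity(nodes, drives_per_node=12, drive_tb=8, k=5, m=2):
    """Usable (net) TB for 'nodes' active OSD nodes under k+m erasure coding."""
    raw = nodes * drives_per_node * drive_tb
    return raw * k / (k + m)

print(net_capacity(7))                     # 480.0 TB net from 672 TB raw
print(net_capacity(8) - net_capacity(7))   # ~68.6 TB net gained per additional node
```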
Proposed Landscape – Objects
CIFS/NFS network: 10GbE
(diagram: same hardware layout as the previous slide, with each object's data chunks D1–D5 and parity chunks P1–P2 spread across the OSD nodes)
Erasure Coding
– Think of it as software RAID for an object
– An object is broken up into 'k' fragments and given 'm' durability pieces
– k=5, m=2 is comparable to RAID 6
(a chunking sketch follows below)
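To make the fragmenting concrete, here is a toy sketch that splits an object into k data chunks and derives parity. For brevity it computes a single XOR parity chunk; Ceph's erasure code plugins (Reed-Solomon based, e.g. jerasure or ISA-L) compute m independent coding chunks, which is what lets k=5, m=2 survive the loss of any two chunks.

```python
from functools import reduce

def chunk_object(data: bytes, k: int):
    """Split 'data' into k equal, zero-padded chunks plus one XOR parity chunk."""
    size = -(-len(data) // k)                 # ceiling division -> stripe unit
    chunks = [data[i*size:(i+1)*size].ljust(size, b'\0') for i in range(k)]
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))
    return chunks, parity

obj = b'some S3 object or VM image payload ' * 100
chunks, parity = chunk_object(obj, k=5)
print(len(chunks), len(chunks[0]), len(parity))     # 5 data chunks + 1 parity chunk

# Rebuild a lost data chunk by XOR-ing the parity with the surviving chunks.
rebuilt = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(parity, *chunks[1:]))
assert rebuilt == chunks[0]
```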
SUSE Enterprise Storage
Extend your scale-out storage to improve resilience
1 DC: k=5, m=2
- 40% overhead
- Failure protection: 2 OSDs
2 DCs: k=5, m=5
- 100% overhead
- Failure protection: 5 OSDs / 1 datacenter
4 DCs: k=8, m=8
- 100% overhead
- Failure protection: 8 OSDs / 2 datacenters
(see the calculation below)
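The overhead and fault-tolerance figures follow from how many chunks each profile writes and how evenly they are spread over the datacenters (assuming a CRUSH rule that distributes chunks evenly; plain arithmetic below):

```python
def profile(k, m, datacenters):
    """Overhead and tolerated failures for k data + m coding chunks spread over DCs."""
    per_dc = (k + m) // datacenters
    return {
        "overhead": f"{m / k:.0%}",   # extra raw capacity per byte of user data
        "osd_failures": m,            # any m chunks may be lost
        "dc_failures": m // per_dc,   # whole-DC losses still within the m-chunk budget
    }

print(profile(5, 2, 1))   # {'overhead': '40%',  'osd_failures': 2, 'dc_failures': 0}
print(profile(5, 5, 2))   # {'overhead': '100%', 'osd_failures': 5, 'dc_failures': 1}
print(profile(8, 8, 4))   # {'overhead': '100%', 'osd_failures': 8, 'dc_failures': 2}
```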
SUSE Enterprise Storage 4
Demo
Questions & Answers
The End
#opentechday
#suse
SUSE
All rights reserved + general disclaimer
Unpublished Work of SUSE LLC. All Rights Reserved.
This work is an unpublished work and contains confidential, proprietary and trade secret information of SUSE LLC.
Access to this work is restricted to SUSE employees who have a need to know to perform tasks within the scope of their
assignments. No part of this work may be practiced, performed, copied, distributed, revised, modified, translated,
abridged, condensed, expanded, collected, or adapted without the prior written consent of SUSE.
Any use or exploitation of this work without authorization could subject the perpetrator to criminal and civil liability.
General Disclaimer
This document is not to be construed as a promise by any participating company to develop, deliver, or market a
product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making
purchasing decisions. SUSE makes no representations or warranties with respect to the contents of this document, and
specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The
development, release, and timing of features or functionality described for SUSE products remains at the sole discretion
of SUSE. Further, SUSE reserves the right to revise this document and to make changes to its content, at any time,
without obligation to notify any person or entity of such revisions or changes. All SUSE marks referenced in this
presentation are trademarks or registered trademarks of Novell, Inc. in the United States and other countries. All third-party trademarks are the property of their respective owners.