Database Systems
7. presentation
State of the art
Bence Molnár
Distributed databases
Distributed systems
— Systems are not attached to a single device but to several networked devices
— Requirements:
  — High-speed network
  — Decreasing prices and increasing speed of CPUs
Why are they used?
— Economical (vs. a supercomputer)
— Huge computation capacity
— Increased reliability
— Joining different services (SOA)
Challenges, solutions
Distributed databases
— Data is stored on multiple physical computers (co-located or at different physical locations), yet is logically integrated and consistent
— Pros:
  — Reduced communication costs
  — Available even if a node fails (robustness)
  — Modular design, flexible configuration (scalability)
  — Easier maintenance
— Cons:
  — Complex system
  — Multiple hardware and software solutions
  — Complicated user (permission) management
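The robustness point above can be illustrated with a minimal Python sketch. It is not how any particular distributed database works; the `Node` and `TinyCluster` classes, the node names, and the sample record are all hypothetical. Each record is written to two nodes chosen by hashing its key, so a read still succeeds after one of those nodes fails.

```python
import hashlib


class Node:
    """A hypothetical storage node: an in-memory dictionary plus an 'alive' flag."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class TinyCluster:
    """Stores every record on `replicas` consecutive nodes selected by hashing the key."""

    def __init__(self, node_names, replicas=2):
        self.nodes = [Node(name) for name in node_names]
        self.replicas = replicas

    def _owners(self, key):
        # Hash the key to pick a starting node, then take the next nodes in order.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)] for i in range(self.replicas)]

    def put(self, key, value):
        # Write the record to every owner that is currently reachable.
        for node in self._owners(key):
            if node.alive:
                node.data[key] = value

    def get(self, key):
        # Read from the first reachable owner that holds the record.
        for node in self._owners(key):
            if node.alive and key in node.data:
                return node.data[key]
        raise KeyError(key)


cluster = TinyCluster(["node-a", "node-b", "node-c"], replicas=2)
cluster.put("packet:1", {"b0": 17})

# Simulate the failure of one of the two nodes holding the record.
cluster._owners("packet:1")[0].alive = False
print(cluster.get("packet:1"))  # {'b0': 17}
```

The sketch also hints at the "complex system" drawback: it ignores what happens when a failed node comes back with stale data, which real systems have to handle.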
Databases in cloud
Cloud
— Large groups of remote servers are networked to allow centralized data storage and online access to computer services or resources
— Service models:
  — IaaS (Infrastructure as a service): Amazon EC2, Windows Azure VM, Google Compute Engine
  — PaaS (Platform as a service): Windows Azure, Google AppEngine, Cloud Foundry
  — SaaS (Software as a service): Google Apps, Facebook, Microsoft Office 365, OnLive
  — Dropbox, Google Drive?
Databases in cloud
— DB on a virtual server (VPS, Virtual Private Server)
  — Oracle DB, PostgreSQL, MySQL, CouchDB, ...
— DB as a service
  — Amazon Dynamo, Google App Engine Store, Microsoft SQL Azure
Accessing databases
Standard drivers
— Goal: managing databases in an OS- and DB-independent way, through drivers
— File-based data is accessible as well (e.g. CSV, XLS, etc.)
— ODBC (Open Database Connectivity): supported by Microsoft
— JDBC (Java Database Connectivity)
— FDO (Feature Data Objects)
Standard drivers
[Diagram: applications (C/C++, Matlab, PHP, Ruby, Java, .NET, ...) reach the data sources through a driver layer (ODBC, JDBC, FDO, ...). Data sources: PostgreSQL, MySQL, ...; Microsoft Jet (Access); CSV and other files; spatial databases (FDO).]
Accessing DB in Matlab
— Database Toolbox
— Support for ODBC & JDBC
— Tables ↔ Matrices (equivalent)
— Database Explorer App
Accessing DB in Matlab (JDBC)
% 1. Download the JDBC driver, e.g. for PostgreSQL:
%    http://jdbc.postgresql.org/download.html
% 2. Add the path of the driver JAR file to classpath.txt
% 3. Set the login timeout in seconds (optional)
logintimeout('org.postgresql.Driver', 5);
% 4. Set the returned data type
setdbprefs('DataReturnFormat','cellarray');
% 5. Connect to the database
connA = database('database', 'username', 'password', ...
    'org.postgresql.Driver', 'jdbc:postgresql://localhost/');
% 6. Validate the connection (optional)
ping(connA);
Accessing DB in Matlab (JDBC)
% 7. Run the query
selCols = 'packetid, b0, b1, b2, b3, b4, b5, b6';
cursorA = exec(connA, ['select ' selCols ' from exp1']);
% 8. Fetch the results into a cell array
%    (to fetch only the first 10 rows: cursorA = fetch(cursorA, 10);)
cursorA = fetch(cursorA);
% 9. Access the data
DataMat = cursorA.Data;
% 10. Close the cursor and the connection (release resources)
close(cursorA);
close(connA);
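For comparison, the same connect, query, fetch, close sequence can be sketched in Python. This is only an illustration under stated assumptions: the built-in `sqlite3` module with an in-memory database stands in for the PostgreSQL server, the table name `exp1` and the column names come from the slide, and the two sample rows are made up.

```python
import sqlite3

# Connect (an in-memory SQLite database stands in for the PostgreSQL server).
conn = sqlite3.connect(":memory:")

# Hypothetical sample data so that the query has something to return.
conn.execute("CREATE TABLE exp1 (packetid INTEGER, b0 INTEGER, b1 INTEGER)")
conn.executemany("INSERT INTO exp1 VALUES (?, ?, ?)", [(1, 10, 20), (2, 11, 21)])

# Run the query.
sel_cols = "packetid, b0, b1"
cursor = conn.execute("SELECT " + sel_cols + " FROM exp1 ORDER BY packetid")

# Fetch the results (a list of row tuples).
data = cursor.fetchall()

# Close the cursor and the connection to release resources.
cursor.close()
conn.close()

print(data)  # [(1, 10, 20), (2, 11, 21)]
```

The application code only talks to the driver interface; switching to another database would mainly change the connect call, which is the point of the driver layer.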
Semi-structured databases
Properties
— Data and schema are not separated
— Pros:
  — The schema doesn't lock the information
  — Flexible format: easy to modify the schema
  — Portable data transfer
— Queries are less efficient compared to SQL
— E.g.: OEM (Object Exchange Model), XML (Extensible Markup Language)
XML
— Standard
— Format:
  — Tag: <something></something>
  — Self-closing tag: <something/>
  — Tags may be nested but must not overlap, e.g.: <something1> <something2> </something2> </something1>
  — Single root element
  — XML declaration, processing instructions and comments
  — XML Schema: XSD
  — XHTML
XML
Example:
<?xml version="1.0" encoding="UTF-8"?>
<Recipes name="bread" preparing_time="5 min" cooking_time="3 hours">
  <title>Simple bread</title>
  <ingredient quantities="3" unit="cup">Flour</ingredient>
  <ingredient quantities="10" unit="decagramme">Yeast</ingredient>
  <ingredient quantities="1.5" unit="cup">Warm water</ingredient>
  <ingredient quantities="1" unit="teaspoon">Salt</ingredient>
  <Commands>
    <step>Mix all ingredients together, then knead well!</step>
    <step>Cover with a cloth and let rest for an hour in a warm room!</step>
    <step>Knead again, put it in a tin pan, then bake it in the oven!</step>
  </Commands>
</Recipes>
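A short Python sketch of how such a document might be read programmatically, using the standard `xml.etree.ElementTree` module. The embedded XML is a shortened copy of the recipe (two ingredients, one step, XML declaration left out), and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the recipe document, used only for illustration.
RECIPE_XML = """<Recipes name="bread" preparing_time="5 min" cooking_time="3 hours">
  <title>Simple bread</title>
  <ingredient quantities="3" unit="cup">Flour</ingredient>
  <ingredient quantities="1" unit="teaspoon">Salt</ingredient>
  <Commands>
    <step>Mix all ingredients together, then knead well!</step>
  </Commands>
</Recipes>"""

root = ET.fromstring(RECIPE_XML)

# Attributes of the root element and the text of a child element.
cooking_time = root.get("cooking_time")
title = root.findtext("title")

# Each ingredient carries data both as text and as attributes.
ingredients = [
    (item.text, float(item.get("quantities")), item.get("unit"))
    for item in root.findall("ingredient")
]

# Steps are nested one level deeper, inside <Commands>.
steps = [step.text.strip() for step in root.iter("step")]

print(title, cooking_time)
print(ingredients)
print(steps)
```

Because the element and attribute names travel with the data, the reader needs no separate schema to interpret the values.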
XQuery
— Query languages
— XPath
— FLWOR expressions:
  — FOR $var IN exp_sequence_nodes
  — LET $var_single_value := exp_values
  — WHERE exp_condition
  — ORDER BY exp_order
  — RETURN exp_result
XQuery example
for $product in doc("catalog.xml")/catalog/product
let $name := $product/name
where $product/@dept = "ACC"
order by $name
return $name
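To make the FLWOR clauses concrete, the query can be mirrored step by step in Python with `xml.etree.ElementTree`. This is a rough equivalent, not an XQuery engine. The content of `catalog.xml` is not given in the slides, so the sample catalog below (product names and the "WMN" department) is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for catalog.xml.
CATALOG_XML = """<catalog>
  <product dept="ACC"><name>Sunglasses</name></product>
  <product dept="WMN"><name>Blouse</name></product>
  <product dept="ACC"><name>Hat</name></product>
</catalog>"""

catalog = ET.fromstring(CATALOG_XML)

selected = []
for product in catalog.findall("product"):   # for $product in .../catalog/product
    name = product.findtext("name")          # let $name := $product/name
    if product.get("dept") == "ACC":         # where $product/@dept = "ACC"
        selected.append(name)

names = sorted(selected)                     # order by $name
print(names)                                 # return $name
# prints ['Hat', 'Sunglasses']
```

One difference: the XQuery version returns the `name` elements themselves, while this sketch returns only their text.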
Document oriented databases
— Storing documents
— Standard formats:
  — XML, JSON, etc.
  — binary: PDF, MS Office, etc.
— Every document has a unique identifier (e.g.: URI)
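The idea of documents addressed by a unique identifier can be sketched as a toy in-memory store in Python. The `DocumentStore` class, the URIs, and the sample documents are hypothetical; real document databases add persistence, indexing and richer queries on top of this.

```python
import json


class DocumentStore:
    """Toy document store: JSON documents addressed by a unique URI."""

    def __init__(self):
        self._docs = {}

    def put(self, uri, document):
        # Documents are kept serialized, as a real store would keep them on disk.
        self._docs[uri] = json.dumps(document)

    def get(self, uri):
        return json.loads(self._docs[uri])

    def find(self, field, value):
        # Documents need not share a schema, so a missing field simply does not match.
        return sorted(
            uri for uri, raw in self._docs.items()
            if json.loads(raw).get(field) == value
        )


store = DocumentStore()
store.put("doc://recipes/bread", {"title": "Simple bread", "type": "recipe"})
store.put("doc://recipes/soup", {"title": "Soup", "type": "recipe", "vegan": True})
store.put("doc://notes/1", {"text": "Buy yeast"})

print(store.get("doc://recipes/bread")["title"])  # Simple bread
print(store.find("type", "recipe"))
```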
Thank You!