Object-relational mapping with SqueakSave - Hasso-Plattner

Object-Relational Mapping with SqueakSave
Thomas Kowark
Robert Hirschfeld
Michael Haupt
Hasso-Plattner-Institute
University of Potsdam, Germany
{firstname.lastname}@hpi.uni-potsdam.de
ABSTRACT
Object persistence is an important aspect of application architectures and development processes. Different solutions in this field have evolved over the last decades, and new approaches are still subject to research. While object-oriented databases are becoming increasingly popular, the usage of relational databases through an object-relational mapping layer is still one of the most widely adopted techniques. However, most object-relational frameworks require a considerable amount of mapping descriptions between object models and relational database schemas. This additional layer has to be maintained by developers along with the object model itself.
In this paper, we present an approach to object-relational mapping that utilizes the introspection and intercession features of Smalltalk to free developers from manually creating those mapping descriptions. The presented framework analyzes the existing models and automatically deduces suitable database schemas. Thus, it aids development processes by eliminating the need for a separately maintained mapping layer.
A detailed introduction of the programming interface is followed by a description of the framework’s internal implementation details. Additionally, the performance of the framework is evaluated through a comparison with a comparable system for the same programming environment.
Keywords
Object-relational mapping, Impedance mismatch, Object-oriented design methods, Data design and management, Automatic schema creation
1. INTRODUCTION
Maintaining application data in persistent storage spaces
is an inherent requirement of most applications. Especially
the web applications that have evolved over the past few
years need to handle steadily growing and evolving data
schemes. While this requirement obviously has an impact
on the complexity and execution speed of applications, it
also influences their development processes.
One of the main criteria for the choice of a suitable persistence strategy is project scope. Enterprise applications rely
on robustness, execution speed and scalability [3], whereas
smaller projects additionally focus on the flexibility to quickly adapt to changes in the object model [2]. Thus, development teams need a persistence solution that does not impede
their development process, but allows them to implement
new features in a simple and straightforward manner.
In addition to project scope, decisions regarding the development environment and language also influence the choice
between available persistence strategies. Especially dynamically-typed languages like Smalltalk vastly reduce turnaround and implementation times by offering a programming paradigm that embraces change of existing implementations [29] and strong meta-programming and reflective features. The latter, however, impose non-trivial challenges for
the implementation of persistence management systems.
Today, many persistence strategies are available [5, 11, 18, 24, 28]. Their underlying data storage technologies cover a wide spectrum, ranging from purely relational databases over relational databases enriched with object-oriented techniques to completely object-oriented implementations. The ease of integration of those solutions into dynamic object-oriented applications differs strongly [15], as the mismatch between the paradigms underlying the application development and the persistence framework varies in its extent [2].
Categories and Subject Descriptors
D.2.2 [Software Engineering]: Design Tools and Techniques - Object-oriented design methods; H.3.4 [Information Storage and Retrieval]: Systems and software - performance evaluation
General Terms
Design, Experimentation, Performance
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
IWST’09 August 31, 2009, Brest, France.
Copyright 2009 ACM 978-1-60558-899-5 ...$10.00.
[Figure 1 (fragment): class User with the private attributes email, username, and password, each of type string.]
A widely adopted solution within this field is the usage of relational databases along with an object-relational mapping (O/R mapping) layer that bridges the gap between an application’s object model and the relational schema of the underlying database [1]. Generic O/R mapping frameworks cover a variety of aspects, ranging from basic CRUD (create, read, update, delete) functionality to more elaborate features like transaction processing. However, most available systems require extensive meta-description of the object model in order to be able to perform the aforementioned tasks.
Such descriptions impose a considerable burden on application development. Each change of the object model implies an alteration of the description layer [21, 22]. Seamless
integration of O/R mapping frameworks into applications is
moreover influenced by the degree of intrusiveness into the
existing object and programming model. A high degree of
transparency of the underlying database structures and systems is desirable [20]. Still, existing implementations vastly
differ in the extent of implementation detail exposure to the
user. This includes query APIs that are not integrated into
the chosen programming language as well as the need to alter inheritance hierarchies or even object layouts in order to
store objects in relational databases.
Within this paper, we present a framework that uses the strong introspection and intercession capabilities of Smalltalk in order to free developers from the creation of extensive object model meta-descriptions. Based on the objects created during application runtime, the framework automatically deduces suitable database schemas that are also adapted whenever developers change their object models. The entire framework thereby remains non-intrusive in the sense that no changes to existing object models have to be performed, and queries on the data space can be carried out by using the well-known semantics of the Smalltalk collection protocol. In this way, the system combines the technique of object-relational mapping with Smalltalk development paradigms and provides behavior similar to that of an object-oriented database within a relational database access layer.
Squeak², an open-source Smalltalk dialect, has been chosen as the development environment for the implementation of the framework due to its focus on educational purposes and its availability on a variety of platforms. The proposed framework is named SqueakSave³.
The first part of the paper presents the workflow of the
integration of SqueakSave into an application. Following
those usage descriptions, the architecture of the framework
is discussed with a focus on implementation details of the
main workflows. After the performance of the system is compared against a popular O/R mapping solution available for Squeak, the paper concludes with remarks about related work within the field of O/R mapping in dynamically-typed object-oriented environments and an outlook on future extensions that could further improve the usability and performance of the framework.
² http://www.squeak.org
³ http://www.hpi-web.de/swa/squeaksource/SqueakSave.html
2. SQUEAKSAVE
In the following, an introduction to the basic usage patterns of the SqueakSave O/R mapper is provided. A simple weblog example application accompanies the description in order to ease the understanding of basic features as well as more elaborate techniques, such as transactions or custom mapping descriptions.
The class structure of the sample application is depicted in the UML class diagram [23] in Figure 1. It exhibits the most common structural challenges that O/R mappers have to handle within applications [13]: inheritance relationships and to-one or to-many associations. While the current section presents the integration of SqueakSave into the weblog application, the mapping of those structural details is the topic of Section 3.
[Figure 1: UML class diagram. Classes: Blog (title : string, lastUpdate : dateTime), BlogPost (title : string, text : string), Comment (author : string, title : string, text : string), Author, Admin. Association roles: blog (1), administeredBlogs (0..*), followers, blogPosts, comments.]
Figure 1: Class Structure of the Example Application.
2.1 Basic Persistence Mechanisms
A main requirement for SqueakSave is to provide straightforward persistence mechanisms. Below, we present the steps that are required in order to set up and use the framework for the most basic purposes. This includes means to store objects within the chosen RDBMS and to query for objects based on certain attribute values.
Initial Setup and Configuration.
For each class of objects that need to be persisted, developers have to set up an instance of SqsConfiguration. Configuration objects include numerous properties that determine the behavior of the framework for the classes they apply to.
In order to register a configuration for the application classes, it is necessary to create a subclass of SqsConfig. The name of this subclass has to follow specific conventions to be recognized by the framework as being valid for a certain set of classes. To create a configuration for the entire application, the first part of the class category, which is normally subdivided by ‘-’ characters [4], has to be the first part of the class name, followed by the suffix SqsConfig.
In the simple use case of the blog example, only the class-side method connectionSpecification has to be implemented to return valid server access credentials. It determines which RDBMS is used as target storage for the respective objects. For each supported system, the framework provides a specialized SqsConnectionSpecification subclass. It provides standard values for port and hostname of common RDBMS server implementations such as MySQL or PostgreSQL. The only mandatory data are username, password, and the name of the target database. It is important that the user account provided for accessing the database has the privileges to create, alter, and drop tables, since SqueakSave continually reorganizes the table structure according to changes within the application classes. The complete configuration class for the example application is depicted in Listing 1.
SqsConfig subclass: #BlogExampleSqsConfig
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'BlogExample'

BlogExampleSqsConfig class>>#connectionSpecification
    ^ SqsMySQLConnectionSpecification
        user: 'admin'
        password: 'password'
        database: 'blog example db'
Listing 1: Configuration Set-Up.
Following the aforementioned naming conventions, it is also possible to create different configurations for sub-categories of the application by extending the category-specific part of the class name prefix.
If the configuration itself has to be altered, it is possible to re-implement the configuration method on the class side of the configuration class. Additionally, the configuration method can be implemented on the class side of each application class, thereby providing the most fine-grained way of setting up configurations.
While it would be more compliant with object-oriented, and especially Smalltalk, principles to directly connect the class category with its configuration [17], this is not possible within Squeak, since the category is only identified by a string and is not accessible as a first-class object.
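The naming convention can be illustrated with a small workspace sketch in plain Squeak. It is not taken from the SqueakSave sources: the category string, the variable names, and in particular the assumption that sub-category configurations are named by concatenating the category parts are hypothetical and only meant to show how a string-based category could be resolved to a configuration class by name.

```smalltalk
"Hypothetical workspace sketch; not part of the SqueakSave sources."
| category parts candidates configClass |
category := 'BlogExample-Model'.
parts := category subStrings: '-'.
"Assumed rule: concatenate the category parts, most specific prefix first,
 and append the suffix SqsConfig."
candidates := (parts size to: 1 by: -1) collect: [:count |
	((parts first: count) inject: '' into: [:prefix :part | prefix , part]) , 'SqsConfig'].
"Resolve the first candidate that names a global in the image; nil if none exists."
configClass := nil.
candidates do: [:name |
	configClass ifNil: [configClass := Smalltalk at: name asSymbol ifAbsent: [nil]]].
configClass
```

In an image without any of the candidate classes, configClass stays nil, which corresponds to the situation in which no configuration is available for a class.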
author := Author new
    password: 'password';
    username: 'testuser';
    email: '[email protected]'.
author blog: (Blog new title: 'My Blog').
author save.
Listing 2: Basic Object Storage.
(SqsSearch for: User) detect: [:aUser |
    aUser username = 'testuser']

(SqsSearch for: Author) select: [:anAuthor |
    anAuthor blog blogPosts size > 10]

(SqsSearch for: Blog) anySatisfy: [:aBlog |
    aBlog blogPosts noneSatisfy: [:aBlogPost |
        aBlogPost comments isEmpty]]
Listing 3: Query Examples - Emulated Collection Protocol.
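The messages used in Listing 3 are the ones that ordinary in-memory collections understand. As a point of comparison, the following workspace sketch runs them against a plain OrderedCollection in Squeak; the data and variable names are made up for illustration, and nothing here touches SqueakSave or a database.

```smalltalk
"Plain Squeak collections only; the user names are illustrative."
| users found missing guarded anyLong |
users := OrderedCollection withAll: #('testuser' 'admin' 'guest').
"detect: answers the first matching element."
found := users detect: [:name | name = 'testuser'].
"detect:ifNone: avoids the error that detect: signals when nothing matches."
missing := users detect: [:name | name = 'nobody'] ifNone: [nil].
"Without ifNone:, a failed search signals an Error that can be handled."
guarded := [users detect: [:name | name = 'nobody']]
	on: Error
	do: [:error | error return: #notFound].
"anySatisfy: answers true as soon as one element fulfills the constraint."
anyLong := users anySatisfy: [:name | name size > 5].
```

The difference in SqueakSave, as described in the text, is that the same protocol is answered by a search object and carried out as SQL instead of being evaluated element by element in the image.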
Persisting Objects.
Convention-based setup of configuration classes is essential to enable simple storing of objects. By extending the Object class, methods have been introduced that implement the data-modifying CRUD operations: creating, updating, and deleting objects. As a consequence of this ‘monkey patching’⁴, any object within the application whose class is a subclass of Object can be stored and updated by sending it the save message. Since no database session or connection specification is passed as a parameter, this method relies on the previously set-up configuration objects and will trigger an exception if no configuration is available for the corresponding class.
Listing 2 presents the creation of an author object along with the associated blog. The save method will store the author object itself and the blog within the database and also create the one-to-one relationship between them.
Removing objects from persistent storage is possible by using the destroy method. It will remove the database rows corresponding to an object, and all references from other database tables to that object. Accordingly, destroying a user object within the sample application will also lead to a removal of the user from each followers collection it has been part of. While the database entries will be removed by the framework, the object itself remains unchanged.
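Because save is sent without any mapping description, the state to be stored has to be discovered reflectively. The following workspace sketch shows the kind of introspection that standard Squeak offers for this purpose, using a Point as a stand-in for an application object. It is an illustration under that assumption and does not reproduce how SqueakSave itself reads object state.

```smalltalk
"Standard Squeak reflection; a Point stands in for an application object."
| subject snapshot |
subject := 3 @ 4.
snapshot := Dictionary new.
"Collect every instance variable name together with its current value,
 without relying on accessor methods."
subject class allInstVarNames do: [:name |
	snapshot at: name asString put: (subject instVarNamed: name asString)].
snapshot
```

A snapshot of this kind contains the attribute names and the classes of the current values, which is the raw material from which column names and types could be suggested.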
⁴ http://en.wikipedia.org/wiki/Monkey_patch
Object Query Interface.
In addition to the modifying CRUD operations, a persistence framework has to offer means to perform queries on the persistent space. Since SqueakSave is built upon a relational database foundation, those queries have to be carried out as SQL statements. Integrating queries in such a way that standard language constructs can be used is an important feature with regard to the usability of an O/R mapper [9]. SqueakSave provides a query interface that does not rely on string-based query encoding, but instead emulates the Smalltalk collection protocol [8].
Object queries are usually sent to instances of SqsSearch. These objects must be initialized with a class; instances of this class and its subclasses will be returned by the query. Queries can be performed on each class residing within an image; however, a valid configuration for this class must be available.
Within the sample application, this behavior can be utilized to distinguish between authors and administrators. If searches are performed on the User class, they will return instances of Admin as well as Author. Performing searches on either of those classes individually, however, will only return their particular instances.
Listing 3 presents example queries that could be used within the blog example application. The first query performs a search for the user with the username ‘testuser’. According to the Smalltalk collection protocol, the detect: method will only return the first user that is found within the database and trigger an exception if no such entry exists.
Query number two uses the aforementioned mechanism to narrow the set of possible search results down to a specific subclass. The presented select: method will find all authors that have a blog with more than ten blog posts.
The last query determines whether any object within a collection fulfills a given constraint. In this particular case, the query will only return true if at least one blog exists where all blog posts have been commented on at least once.
The messages sent to the query objects, such as aBlog or aUser, are limited to accessor methods that are named exactly like the corresponding instance variables. Subsequent method invocations on the return values, such as collections, integers, or strings, must be implemented within the respective classes of the SqueakSave framework (see Section 3.5).
Blog findByTitle: 'testblog'.
Comment findByAuthor: 'author' andTitle: 'comment'.
Listing 5: Query Examples - Convention-Based Dynamic Finders on Classes.
sessionManager := SqsConnectionManager getInstance.
session := sessionManager sessionForClass: Blog.
session := sessionManager sessionForCategory: 'BlogExample'.
session := sessionManager sessionForConfiguration: aCustomConfiguration.
Listing 6: Possible Ways to Retrieve Session Objects.
In addition to the collection protocol emulation, SqueakSave offers convention-based dynamic query methods similar to those in other dynamic-language object-relational mappers such as GORM [28] for Grails⁵ or ActiveRecord for Ruby on Rails [12].
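How a convention-based finder selector might be decomposed can be sketched in plain Squeak. The snippet below is a hypothetical illustration, not SqueakSave's implementation: it assumes that the keyword prefixes findBy, and, and or are stripped, that the remaining attribute name is lowercased to obtain a column name, and that the constraints are joined into an SQL-like condition string.

```smalltalk
"Hypothetical sketch of selector decomposition; not the SqueakSave implementation."
| selector arguments clause |
selector := #findByAuthor:andTitle:.
arguments := #('author' 'comment').
clause := WriteStream on: String new.
selector keywords doWithIndex: [:keyword :index |
	| part connector column |
	part := keyword allButLast.	"drop the trailing colon"
	connector := nil.
	column := nil.
	(part beginsWith: 'findBy') ifTrue: [column := part allButFirst: 6].
	(part beginsWith: 'and') ifTrue: [connector := 'AND'. column := part allButFirst: 3].
	(part beginsWith: 'or') ifTrue: [connector := 'OR'. column := part allButFirst: 2].
	connector isNil ifFalse: [clause space; nextPutAll: connector; space].
	clause
		nextPutAll: column asLowercase;
		nextPutAll: ' = ';
		print: (arguments at: index)].
clause contents
```

For the selector and arguments shown, the resulting string is author = 'author' AND title = 'comment'. A real implementation would additionally have to map attribute names to the configured column names and pass the values as bound parameters rather than printing them into the statement.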
(SqsSearch for: Blog) findByTitle: 'testblog'.
(SqsSearch for: Comment)
    findByAuthor: 'author' andTitle: 'comment'.
Listing 4: Query Examples - Convention-Based Dynamic Finders.
The first query presented in Listing 4 depicts a simple use case where instances of the Blog class have to be found by an exact match between the given argument and the current value of the title instance variable. The second search is an example for the concatenation of constraints. Concatenation keywords (i.e., ‘and’) adhere to SQL terminology. Thus, ‘or’ can be used as well within dynamic finders.
The aforementioned object-relational mappers allow for calling the dynamic finder methods directly on a class. In order to achieve the same behavior in Squeak, it would be necessary to either override the doesNotUnderstand: method within Class, or provide a means for application developers to integrate this implementation only within their model classes. This fine-grained integration could be achieved by providing an abstract base class that application classes have to inherit from. However, this kind of intrusion into the inheritance structure would not comply with the requirement to provide persistence as an aspect added to the application instead of being an integral part of it. A less intrusive technique is the usage of traits. They were introduced in the Self programming language [30] and later applied to Squeak [10] to provide a more fine-grained mechanism for reusing existing implementation details. By adding the TSqsSearch trait to any application object model class, queries can be performed as depicted in Listing 5.
2.2 Customization
Utilization of the presented techniques to store and query for objects is sufficient to perform basic CRUD operations on application data. However, extensions are required for customizing the O/R mapping framework behavior, and for optimizing aspects of performance and robustness.
Custom Configuration.
The configuration object includes properties that define standard values for certain fields of the resulting database schema as well as architecture patterns that are used for the mapping of object-oriented structures to relational constructs.
Specialized configurations for subcategories and single classes are possible by implementing a configuration method in the respective configuration classes. The configuration object is available within those methods by calling super configuration. Attributes of objects referring to field names can be changed, e.g., to adhere to naming conventions of other O/R mappers, or to solve naming conflicts.
Altering the configuration can also be used to fine-tune framework behavior. It is possible to define whether instance variable accessor methods or object introspection mechanisms should be used to access instance variable values by setting useInstVarAccessor to either true or false.
While the framework by default alters table structures and association types only after developers have confirmed those changes, the warnOnAlteration attribute can be set to disable the corresponding warning dialogs.
When the object model is finalized and mapping update functionality is no longer required, the introspection behavior should be disabled in order to improve the overall performance of basic persistence operations. For this purpose, the environment attribute of the configuration can be set to the value ‘#production’ instead of its default value ‘#development’.
Session Usage.
While the implementation of SqueakSave frees users from the need to utilize an explicit session object to store, retrieve, and delete objects, some more advanced functionality is available only by using instances of SqsSession. Session objects can be retrieved from the singleton instance of the SqsConnectionManager. It caches the sessions on a per-thread basis. Thus, requesting a session for a certain configuration, class, or category will always return the same object within a single thread of control. The different possibilities to get the current session for the sample application are depicted in Listing 6.
With the session object, it is possible to perform transactions and define the intended behavior upon transaction failures. If the SqueakSave session is, for example, stored within a Seaside⁶ session object, and all data manipulation operations are performed by passing the session as an explicit parameter, transactions can even span the entire life cycle of web application usage by a single user. Transactions do not have to be performed by defining a block closure for the transactional behavior and one for the rollback case, but it is possible to explicitly start and commit them via the respective methods of the session protocol.
Listing 7 depicts the two possibilities by using an explicit session object that has been retrieved as shown in Listing 6. The rollback method will set the instance variables of the user object back to the pre-transaction state.
⁵ http://www.grails.org
⁶ http://www.seaside.st
transactionalBlock := [
    testuser email: '[email protected]'.
    testuser save: session.
    testuser password: 'newPassword'.
    testuser save: session ].

session
    inTransactionDo: transactionalBlock
    ifError: [ testuser rollback ].

"alternatively"
session startTransaction.
transactionalBlock value.
session commitTransactionIfError: [ testuser rollback ].
Listing 7: Transactions within Sessions.
Performance Optimization.
The database schemas created by SqueakSave follow the basic patterns described by Fowler et al. [13]: single, concrete, and class table inheritance. However, not all of those patterns may be suitable for each object model. Especially deep inheritance hierarchies can create performance problems if they are mapped to a single table. Additionally, an abstract base class for all application classes should be ignored for persistence purposes, since otherwise each subclass instance has to be saved within the base class table as well (class table inheritance), or all application objects will reside within the same table (single table inheritance).
SqueakSave also offers means to control the object graph traversal depth required to store or update objects. Within the example that is presented in Listing 8, the consecutive usage of the different methods that enable this behavior will gradually store more associated objects of the user object. While flatSave only stores direct attributes, save also includes the blog of the user in the storing process. With saveToLevel: 2, the blog posts are considered as well, since two references have to be followed from the user to those objects. The final call of deepSave stores every object reachable from the user object and only stops upon cyclic dependencies or if no further references are detected.
newBlog := Blog new title: 'New Blog'.
newPost := BlogPost new title: 'New BlogPost'.
newComment := Comment new title: 'New Comment'.
newPost comments add: newComment.
newBlog blogPosts add: newPost.
testuser blog: newBlog.

testuser flatSave.
testuser save.
testuser saveToLevel: 2.
testuser deepSave.
Listing 8: Different Save Levels of SqueakSave.
Custom O/R Mapping Descriptions.
While SqueakSave mostly hides the creation and handling of O/R mapping descriptions, they are not only kept in memory during persistence operations but are also stored within the image for later usage. The format of this persistence is defined by the chosen description handler class. This can be altered within the configuration object itself. The standard description handlers utilize the internal format of the meta-descriptions and simply serialize the corresponding objects. However, custom mapping descriptions, such as pragmas or XML documents, can be generated as well if the corresponding description handler classes have been implemented. Due to this fact, the techniques to mark descriptions, or parts of them, as being manually maintained differ between the description handler implementations.
Regarding the standard description handler, each description includes a manuallyMaintained flag that indicates whether it is maintained by users or not. If this flag is set, automatic updates will not alter the particular description. However, if the custom description requires changes to the database schema, they will be carried out by the framework.
A variety of options can be altered within the mapping description for particular instance variables. This includes trivial values, such as the column name or the SQL type of the column, but also more advanced features like foreign-key constraints. Additionally, it is possible to alter the name of the table that is created for each class. Listing 9 depicts a custom configuration for the username field of the account data.
AccountData class>>#sqsDescrUsername
    ^ SqsColumn new
        manuallyMaintained: true;
        columnName: 'name';
        sqlType: #varchar:20;
        linkedAttribute: #username.
Listing 9: Custom Mapping Description.
2.3
Summary
The preceding presentation of the usage workflow of SqueakSave has demonstrated, that the requirements regarding simplicity of usage as well as customizability as a means to
increase interoperability, have been fulfilled. It becomes apparent that only minimal configuration is necessary, in order
to add persistence in a very transparent manner to an existing application. While the API of SqueakSave may not
comply with every other available solution, and thus changes
to the source code might have to be carried out, this does
not necessarily decrease the ease-of-integration. It is generally advised to encapsulate database access functionality
in a separate layer between the application and the persistence framework. Within this layer the presented CRUDfunctionality can be implemented in a very intuitive man91
Visual Paradigm for UML Community Edition [not for commercial use]
Object
0..*
storedObject
1
class
1
Class
currentClass
1
stance variable name and the types are pre-defined within
sqsType methods on the class side of the respective classes.
This methods return a SqueakSave internal string representation of the according SQL type. For types with variable
length the mappings are additionally enriched with the information about the current length of the respective object.
Information about complex attributes—objects that cannot be mapped to simple SQL types but require a separate
table structure—is additionally tagged with the class of the
respective object as well as a generic description of a foreignkey relation to the database table for that particular class.
For attributes holding collections of objects, the type of the
collection, the class of the collection index, and the class of
the included elements have to be determined.
All this information is persisted in the format specified by
the corresponding description handler. Upon every save of
an object the description handler has to determine whether
changes to the relational structure would be necessary by
examining each instance variable for differences compared
to the previous version of the description.
Alterations can become unavoidable in a variety of scenarios. Most obviously that is the case if the class of an
assigned value has changed. However, not every object class
change requires a database structure change. Certain types
comply with each other with regards to their database representation. Within the example application, this behavior
could be observed if an Admin object is the current value of
an attribute that was previously pointing to general User
objects. For collections, it is also necessary to determine
whether the type of the collection itself has changed since
indexable collections like an OrderedCollection or a Dictionary would require the storage of the index, whereas a
Set , for example, would not require such a field.
Depending on the specified configuration, the framework
issues a warning dialog before changing the descriptions. If
developers decide to not allow the requested changes, the
storing procedure is aborted.
1
instVarValue
SqsBase
SqsStorage
SqsConnection
0..*
SqsProxy
SqsConnectionManager
1
0..*
1
1
+classInfo
1
0..*
+session
1
descriptionHandler
SqsClassInfo
1
SqsSession
1
<<use>>
dbAdapter
1
SqsDatabaseAdapter
SqsDescriptionHandler
<<use>>
1
tableStructureHandler
SqsTableStructureHandler
0..*
0..1
connection
SqsDatabaseConnection
1
Figure 2: Overview of SqueakSave System Classes
ner.
3.
FRAMEWORK ARCHITECTURE
The usage workflow described in the preceding chapter is
realized by the core classes of the SqueakSave framework.
They are depicted in a simplified manner in Figure 2, i.e.,
without the inclusion of concrete subclass implementations.
3.1
Storage Wrapper Class
Enriching objects with capabilities that have not been implemented within their respective class definitions can be
realized by utilizing a number of standard patterns. As existing class definitions shall not be altered, the SqueakSave
framework relies on the SqsStorage class as a decorator [14]
that handles persistence-related operations such as storing,
updating, or deleting objects.
Accordingly, calls of save or destroy will be internally
delegated to an instance of SqsStorage instead of being handled completely by the target objects themselves. For each
object that is present within the image, a unique SqsStorage
instance is created on demand. Due to a caching mechanism
that is utilizing weak references [16], the respective instances
are only available as long as the base object is not subject
to garbage collection.
In addition to the decorator, the framework will also assign a unique object id to each persisted object. Those
unique identifiers, that are usually generated by the respective RDBMS, are required to couple an object to its database
representation and, accordingly, enable references between
objects on the database level [1]. The ids are stored as an
instance variable of the decorators within the image and in
a primary key column within the database.
The decorator is connected to the current database session
and by that has access to the corresponding configuration
for the decorated object. The configuration determines the
classes of the descriptionHandler and tableStructureHandler instance variable variables.
3.2
3.3
Table Structure Adaption
After the mapping descriptions have been updated, the
SqsStorage decorator passes control to the table structure
handler. It translates the general attribute descriptions to
representations of actual relational constructs (tables, columns, or constraints) and thus builds a generically traversable abstraction from the actual table structure. Each such
table object can have a number of columns, foreign key constraints, and child tables. In addition, each child table also
includes a reference to its parent table.
In a straightforward case, however, the structures created
from the descriptions are rather simple. Depending on the
inheritance mode specified within the configuration, all attributes reside within the same table (single table inheritance), or a separate child table is created for each subclass
(class table inheritance). Within those tables, a column with
the previously determined SQL type is created for each simple attribute. For complex attributes, the handler will also
create a foreign-key constraint that guarantees the referential integrity of the reference to the table of associated objects.
O/R Mappings: Creation and Update
The description handler is responsible for creating mappings between objects and their database representations. It
does not create the underlying database schema but analyzes
the given objects using introspection and creates detailed
descriptions for the current values of an object’s attributes.
For most basic data types, such as strings or integers, the
mapping to relational constructs is straightforward. The
suggested column names are simply deduced from the in-
Collection Mapping.
Collections of objects are always mapped to join tables. Unlike other O/R mappers, SqueakSave does not represent one-to-many
relations as foreign keys within the table of the referenced
objects. This is a direct consequence of two problems. The
first one is the distinction between one-to-many and many-to-many relations through reflection. While it would be possible to detect those relations, doing so has
proven too time-consuming during program execution. Not only would the framework have to follow
all references pointing to objects within a collection until
one is found that has more than one reference to it;
database queries would additionally be required to check
whether references exist that are not currently present within the
application’s object memory.
The second problem is the inversion of the logical association direction from the object model to the relational
structure [21]. Instead of the collection owner pointing to
the values of the collection, elements within that collection
would reference their owner. This fact is also problematic
regarding object usage within many collections in different
classes or instance variables of the same class. It would be
required to add a new table column for every reference to
those objects.
The created join tables contain a field referencing the table
entry of the collection owner and another column pointing to
the respective object within the collection. Additionally, an
order field is introduced if the application uses ordered collections. This field is created with the type of the index value
of the collection. To map an Array, for example, the index
field would be of type INTEGER, while a string-indexed dictionary would require a VARCHAR type. If the collection
only includes simple values, the reference field to collection
elements will be replaced with a field of the respective type
that directly stores them within the join table.
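The join table layout described above can be sketched as a small generator of CREATE TABLE statements. This is a Python illustration under stated assumptions: the function join_table_ddl and all table and column names (owner_id, idx, element_id, element_value) are hypothetical and do not reflect SqueakSave's actual naming scheme.

```python
def join_table_ddl(owner_table, attribute, element_sql_type=None, index_sql_type=None):
    """Build a CREATE TABLE statement for one collection-valued attribute."""
    # Every join table row points back to the collection owner.
    columns = ["owner_id INTEGER NOT NULL REFERENCES {}(id)".format(owner_table)]
    if index_sql_type is not None:
        # Ordered or keyed collections store the index, typed like the index values.
        columns.append("idx {} NOT NULL".format(index_sql_type))
    if element_sql_type is None:
        # Complex elements are referenced by their object id.
        columns.append("element_id INTEGER NOT NULL")
    else:
        # Simple values are stored directly within the join table.
        columns.append("element_value {}".format(element_sql_type))
    return "CREATE TABLE {}_{} ({})".format(owner_table, attribute, ", ".join(columns))
```

Calling join_table_ddl("blogs", "posts", index_sql_type="INTEGER") covers the Array case, while passing element_sql_type="VARCHAR(255)" without an index type covers an unordered collection of strings.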
processed in the present operation. A flag is set upon first
traversal, and if cyclic references lead to an object again,
only changes to instance variables and owned collections will
be examined.
Decorators also create a simple representation of the state
of the decorated object upon each save call. This so-called
instance variable value map enables the framework to quickly
determine whether an object has changed at all and if so,
which variables have changed. Unchanged variables will be
ignored during mapping description updates and also not be
part of the ‘UPDATE’ statement issued on the database.
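The idea of the instance variable value map can be sketched as follows. This is a Python illustration, not the framework's code: snapshot, changed_variables, and update_statement are hypothetical names, and the comparison is a shallow equality check on instance variables.

```python
def snapshot(obj):
    """Return a shallow copy of the object's instance variables."""
    return dict(vars(obj))


def changed_variables(previous, obj):
    """Names of instance variables whose values differ from the snapshot."""
    current = vars(obj)
    return sorted(
        name for name in current
        if name not in previous or previous[name] != current[name]
    )


def update_statement(table, object_id, previous, obj):
    """Build a parameterized UPDATE limited to changed columns, or None."""
    changed = changed_variables(previous, obj)
    if not changed:
        return None  # nothing changed, so no statement is issued
    assignments = ", ".join("{} = %s".format(name) for name in changed)
    sql = "UPDATE {} SET {} WHERE id = %s".format(table, assignments)
    params = [getattr(obj, name) for name in changed] + [object_id]
    return sql, params
```

An unchanged object yields no statement at all, and a changed one produces an UPDATE that mentions only the altered columns.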
Database Connection Handling.
Database adapters encapsulate SQL query generation according to the specifications of the respective RDBMS. To
execute those queries, adapters rely on SqsDatabaseConnection instances. These conceal differences between the
connection objects supplied by the different database access
drivers.
The physical database connection is obtained by the database adapters only when required, and dropped whenever
queries have been executed successfully. While connecting
and disconnecting to the server upon each request would
have simplified the implementation, it is not a viable approach with regards to performance. Login procedures on
database servers are rather costly in comparison to execution times of smaller queries. Therefore, SqueakSave implements a centralized connection pool. This pool is maintained by the singleton SqsConnectionManager and, because it is built on a SharedQueue, is also thread-safe. Each
adapter that requires a database connection has to request it from
the connection manager and either receives one instantly or waits until a connection is returned to the queue by another adapter. The shared queue guards the insertion and retrieval processes. Hence, it is guaranteed that each connection is only
assigned to one adapter at a time. All adapters that have
to wait for a connection are also waiting for the semaphore
to become available and, accordingly, race conditions are
prevented in this scenario, too.
While this standard behavior is suitable for most basic
operations, it obviously cannot be used during transactions.
Therefore, each database adapter is aware of its current
transaction state and does not return connections to the
queue while a transaction is in progress.
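A queue-backed pool with transaction-aware adapters can be sketched as follows. The sketch uses Python's queue.Queue in place of Squeak's SharedQueue; ConnectionPool and Adapter are hypothetical names, and, unlike the pool described above, this simplified version creates all connections eagerly.

```python
import queue


class ConnectionPool:
    """Thread-safe pool; `connect` is a caller-supplied zero-argument factory."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    def acquire(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty after `timeout` seconds.
        return self._idle.get(timeout=timeout)

    def release(self, connection):
        self._idle.put(connection)

    def idle_count(self):
        return self._idle.qsize()


class Adapter:
    """Borrows a connection per query and keeps it while a transaction is open."""

    def __init__(self, pool):
        self._pool = pool
        self._connection = None
        self._in_transaction = False

    def execute(self, run):
        """Call run(connection) and return its result."""
        if self._connection is None:
            self._connection = self._pool.acquire()
        try:
            return run(self._connection)
        finally:
            if not self._in_transaction:
                self._give_back()

    def begin(self):
        self._in_transaction = True

    def commit(self):
        self._in_transaction = False
        self._give_back()

    def _give_back(self):
        if self._connection is not None:
            self._pool.release(self._connection)
            self._connection = None
```

Outside a transaction the connection returns to the queue after every query; between begin and commit the adapter holds on to it, so all statements of the transaction run on the same connection.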
Structure Updating.
If the table structure already has been created, the table structure handler compares a cached version of the class
table with the one created from current descriptions. The
SqsTableChanges class is capable of comparing two tables
and extracting all columns whose names or types have been altered. Additionally, it detects added and removed columns
and foreign key constraints. All required changes are subsequently carried out on the database.
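The comparison of a cached and a freshly derived table structure can be sketched as a diff over column mappings. This Python illustration rests on several assumptions: diff_columns and alter_statements are hypothetical names, tables are reduced to {column name: SQL type} dictionaries, a renamed column shows up as one removal plus one addition, and the ALTER COLUMN ... TYPE form follows PostgreSQL syntax.

```python
def diff_columns(cached, current):
    """Compare two {column name: SQL type} mappings."""
    added = {n: t for n, t in current.items() if n not in cached}
    removed = sorted(n for n in cached if n not in current)
    retyped = {n: t for n, t in current.items() if n in cached and cached[n] != t}
    return added, removed, retyped


def alter_statements(table, cached, current):
    """Translate the detected differences into ALTER TABLE statements."""
    added, removed, retyped = diff_columns(cached, current)
    statements = []
    for name, sql_type in sorted(added.items()):
        statements.append("ALTER TABLE {} ADD COLUMN {} {}".format(table, name, sql_type))
    for name in removed:
        statements.append("ALTER TABLE {} DROP COLUMN {}".format(table, name))
    for name, sql_type in sorted(retyped.items()):
        statements.append(
            "ALTER TABLE {} ALTER COLUMN {} TYPE {}".format(table, name, sql_type)
        )
    return statements
```

Foreign key constraints could be compared in the same way; they are left out here to keep the sketch short.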
Since this process is highly sensitive to interference with
similar operations carried out by other processes, a semaphore guards the entire structure update and creation workflow. While this might diminish the overall system performance, it is necessary to keep the cached table structures
and, accordingly, the database schema in a consistent state.
Finally, after the table structure has been altered to the
required schema, the description handler inserts the values
into the corresponding tables.
3.4
3.5
Query Generation
The following section provides a detailed explanation of
the SQL query generation from method invocations on the
language-native query API.
Collection Protocol Emulation.
The implementation of the collection protocol emulation
for object queries is based on the work of W. Harford and E.
Hochmeister, who have implemented a quite similar system
for the ReServe project (footnote 7). While the basic implementation
allowed for simple queries on directly associated attributes
of objects, it has been enriched with the capabilities to define query conditions on associated collections and directly
associated objects to a much deeper level within the object
graph structure.
In order to analyze the block-closures that are passed
Supporting Workflows
The previously described procedures are sufficient for the
basic implementation of O/R mapping and table structure
creation and updates as well as insertion of the actual values
into the database. However, more elaborate workflows are
required to improve the mapper’s performance or handle
special circumstances, such as cyclic dependencies.
By tightly coupling decorator instances to decorated objects, it is possible to cope with recursive calls of the save
method. Decorator instances will only try to store associated objects if the current object has not already been
7 http://www.squeaksource.com/REServe.html
[Figure 3 class diagram: SqsQuery (queryClass : Class, whereBuffer : String, orderBy : String, ignoreTypeField : Boolean, distinct : Boolean); SqsQueryTable (aliasSuffix : string, field : string, queryClass : Class, originalTable : SqsTable); SqsTableLink (joinDirection : string, fromFields : Collection, toFields : Collection); SqsQueryValue (whereBuffer : String, depictedClass : Class, referencedColumn : SqsPersistenceDescription); further classes shown: ProtoObject, SqsQueryObject, SqsQueryCollection, SqsQueryString, SqsQueryNumber, SqsQueryDate, SqsQueryDateTime]
[:aComment | (aComment author = 'author') &
(aComment title = 'comment')].
Listing 12: Block-Closure Generated from Dynamic Finder Method.
explicit distinctions between the different table models.
Convention-Based Query Methods.
The implementation of the convention-based dynamic query methods is also based on the collection protocol emulation. Therefore, the finder methods are analyzed for the occurrence of attribute names and the respective values. This
is performed within a re-implementation of the doesNotUnderstand method that handles calls of undefined methods
on objects. The method checks whether the first part of
the selector either matches find or findAll. If either of those
strings matches the beginning of the given method selector, the remaining parts are scrutinized for their compliance
with instance variable names of the respective search class.
Finally, the algorithm determines the logical operators that
are implied by the method name.
Afterwards the framework creates block-closures depicting those constraints and concatenates them with the chosen
logical operators. The block-closures are generated by utilizing the previously extracted strings from the method selector name and the arguments passed to the dynamic finder
method. The values are especially important in this case,
since they have to be translated into a string. Complex objects, for example, require the inclusion of their object id into
the query string, while simple types such as dates or strings
need to be escaped to be properly parsed by the Squeak compiler. Therefore, the SqsSearch class maintains a dictionary
with the respective methods it has to call for certain types
of objects. Once the string representation has been successfully generated, it is passed to the Compiler, which generates
executable bytecode for the required block-closure.
These block-closures are then forwarded to an instance
of the SqsQuery class, which analyzes them as described previously. Listing 12 depicts the block-closure created from
the second dynamic finder method presented in Listing 5.
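The selector analysis for convention-based finders can be illustrated with Python's __getattr__ hook, which plays a role comparable to doesNotUnderstand. All names here are assumptions: the find_by_..._and_... naming convention, the Finder class, and the returned condition string are hypothetical, and the sketch returns the condition text instead of compiling a block-closure.

```python
import re


class Finder:
    """Resolves undefined find_by_* and find_all_by_* methods at call time."""

    def __init__(self, attributes):
        self._attributes = set(attributes)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        match = re.fullmatch(r"(find_all|find)_by_(.+)", name)
        if match is None:
            raise AttributeError(name)
        mode, rest = match.groups()
        # Splitting on a captured group keeps the operators: a, 'and', b, ...
        parts = re.split(r"_(and|or)_", rest)
        names, operators = parts[0::2], parts[1::2]
        if any(n not in self._attributes for n in names):
            raise AttributeError(name)

        def finder(*values):
            if len(values) != len(names):
                raise TypeError("expected {} arguments".format(len(names)))
            conditions = ["{} = {!r}".format(n, v) for n, v in zip(names, values)]
            text = conditions[0]
            for operator, condition in zip(operators, conditions[1:]):
                text = "{} {} {}".format(text, operator.upper(), condition)
            return mode, text

        return finder
```

A call such as Finder(["author", "title"]).find_by_author_and_title("author", "comment") yields the mode "find" together with a condition joining both comparisons with AND; selectors that mention unknown attributes are rejected.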
Figure 3: Collection Protocol Emulation Classes
as arguments to the respective collection methods, SqueakSave utilizes the SqsQueryValue classes depicted in Figure
3. Each of those classes imitates the protocol of basic system
classes such as Integer or String. Instead of delivering the result of each operation, however, the methods gradually fill
the whereBuffer attribute with the SQL equivalents of the
respective operations. Listing 11 presents the SQL WHERE
clause that is generated for a sample query (Listing 10).
(SqsQuery on: BlogPost) analyze: [:aBlogPost |
aBlogPost text size > 100].
Listing 10: Language-Native Query Before Translation
WHERE CHAR_LENGTH(blog_posts.text) > 100
Listing 11: Generated SQL WHERE Statement
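The translation from Listing 10 to Listing 11 can be approximated with operator overloading. The following Python sketch is an illustration under assumptions, not the framework's implementation: QueryValue, QueryString, QueryObject, literal, and where_clause are hypothetical names, and a lambda takes the place of the block-closure.

```python
def literal(value):
    """Render a Python value as an SQL literal (strings are quoted and escaped)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


class QueryValue:
    """Records SQL text instead of computing a result."""

    def __init__(self, sql):
        self.sql = sql

    def _compare(self, operator, other):
        return QueryValue("{} {} {}".format(self.sql, operator, literal(other)))

    def __gt__(self, other):
        return self._compare(">", other)

    def __lt__(self, other):
        return self._compare("<", other)

    def __eq__(self, other):
        return self._compare("=", other)

    def __and__(self, other):
        return QueryValue("({}) AND ({})".format(self.sql, other.sql))


class QueryString(QueryValue):
    """Imitates the part of the string protocol used in the example."""

    @property
    def size(self):
        return QueryValue("CHAR_LENGTH({})".format(self.sql))


class QueryObject:
    """Maps attribute access to qualified column names."""

    def __init__(self, table, columns):
        self._table = table
        self._columns = columns  # attribute name -> QueryValue subclass

    def __getattr__(self, name):
        if name not in self._columns:
            raise AttributeError(name)
        return self._columns[name]("{}.{}".format(self._table, name))


def where_clause(table, columns, block):
    """Evaluate the block against a query object and return the WHERE clause."""
    return "WHERE " + block(QueryObject(table, columns)).sql
```

With these definitions, where_clause("blog_posts", {"text": QueryString}, lambda p: p.text.size > 100) reproduces the clause shown in Listing 11. Table aliasing and joins across foreign keys are not covered by this sketch.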
Complex objects that cannot be directly mapped to an
SQL type are represented by instances of SqsQueryObject.
Each message sent to those objects is analyzed with regard
to the database columns representing the corresponding attribute. If such a column exists, the where buffer is extended
with a unique identifier consisting of the corresponding table
and column name. If columns refer to rows in different tables (i.e., foreign key relations), this scoping is performed by
SqsQueryObjects, too. Upon each scoping to another table,
the table names are aliased with a unique suffix, which
allows for the handling of self-referencing foreign keys.
In addition to the WHERE clause creation, the system also collects the tables that are relevant to the
query within SqsQueryTable objects. They include a unique
suffix and a reference to the SqsTable object that serves as
a meta-description of the database table structure. Additionally, a number of links to other tables can be added
to a query table in order to represent joins that have to
be performed for queries. During the final steps of query
generation, those query tables are connected to form the
FROM part of the SQL query. Tables whose values have
to be returned from a query are stored in the valueTables
collection of an SqsQueryObject.
This generic analysis of block-closures allows the framework to handle table structures for class and single table inheritance and the nesting of constraints, e.g., for sub queries
on collections that are owned by query objects, without any
Object Proxies.
For performance and framework internal reasons, instances of SqsProxy are inserted into query results instead of
directly associated complex objects or collections. There are
dedicated proxies for directly associated objects and those
representing collections.
Proxies for directly associated objects like a user’s blog in
the sample application are necessary to avoid an eager loading of the entire object graph upon the creation of query
results. The proxies are initialized with all information required to trigger loading of the depicted object if the application accesses them. All calls to proxy objects, except for
those defined on ProtoObject, are delegated to the loaded
instances. Thereby, proxy insertion remains transparent to
framework users and the proxies could also be removed once
the depicted object is present within the image.
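The lazy loading behavior of such a proxy can be sketched in a few lines. This is a Python illustration; LazyProxy and the loader callable are hypothetical, and __getattr__ stands in for the delegation of all messages not defined on ProtoObject.

```python
class LazyProxy:
    """Stands in for an associated object and loads it on first access."""

    def __init__(self, loader):
        self._loader = loader  # zero-argument callable that fetches the real object
        self._target = None

    def __getattr__(self, name):
        # Only called for names the proxy itself does not define.
        if self._target is None:
            self._target = self._loader()
        return getattr(self._target, name)
```

The loader runs once, on the first delegated access, so query results can be built without fetching the entire object graph.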
Collection handling requires a different approach to proxy
insertion. While the aforementioned objects only serve as
placeholders, collection proxies are essential to detect changes in collections. Therefore, before each save call and after loading an object as the result of a search query, an
instance of SqsCollectionProxy is inserted instead of the
original collection. In addition to loading all objects that
are part of the original collection, those proxies also create and maintain an internal map of the collection objects.
This allows the framework to detect added, displaced, and
removed objects in a collection. Hence, after each successful save call, the collection map is updated, and if the
object referencing the collection is saved again, all changes
that happened up to this point are also reflected within
the database.
Database Adapters.
An obvious extension point for an O/R mapper is the set of adapters for different RDBMS. They implement the generation
of the SQL queries representing certain database operations. In
order to provide a custom adapter, two steps are mandatory
for prospective extension developers.
The first one is to create a subclass of SqsConnection that
implements some basic operations to control the state of the
actual database connection and to execute queries on it.
The connection control methods are required in order to automatically create new connections within the connection pool. Therefore, the init, close, and isAlive operations
have to be implemented. In addition to the query execution,
the framework also requires means to convert the query results from the client-internal format into a general one that
can be handled by SqueakSave adapters.
While it is necessary to re-implement those methods for
each adapter that relies on a native client implementation, it
would be possible to utilize an open standard interface that
provides the same access methods, regardless of the underlying database. This includes connectors like ODBC (footnote 8)
or OpenDBX (footnote 9). However, the setup of those two solutions
requires not only the installation of respective clients for
Squeak, but additionally the installation or even compilation
of platform-dependent libraries within the operating system.
The methods within the protocol of SqsDatabaseAdapter
that have to be overridden in order to provide a working
adapter implementation for a certain RDBMS are rather
difficult to determine in advance. This is mainly a consequence of
the custom extensions to the SQL standard implemented by
different RDBMS vendors. The basic implementation within
SqueakSave, however, strives to implement almost all operations according to the SQL standard. This should minimize
the number of methods that have to be overridden.
Object Caches.
Caches are not only used to store object ids without
altering the object model or inheritance structure; query performance optimization also relies on them. To avoid
rebuilding objects that already are part of query results or have
been instantiated just recently, it is necessary to maintain
an additional cache. It has to return pre-built instances
identified by their class name and object id.
While caching all available objects could improve the performance of query result creation, a trade-off between the
memory footprint of the framework and the performance
gain induced by result caching has to be made. Therefore,
the cache size is limited on a per class basis to a configurable
number of entries and makes it possible to implement different cache sizes for each application.
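A cache that is bounded per class and keyed by class name and object id can be sketched as follows. This is a Python illustration; ObjectCache is a hypothetical name, and the least-recently-used eviction policy is an assumption, since the text above only states that the number of entries per class is limited.

```python
from collections import OrderedDict


class ObjectCache:
    """Bounded cache of pre-built instances, keyed by class name and object id."""

    def __init__(self, default_limit=100, limits=None):
        self._default_limit = default_limit
        self._limits = dict(limits or {})  # optional per-class overrides
        self._entries = {}  # class name -> OrderedDict(object id -> instance)

    def put(self, class_name, object_id, instance):
        entries = self._entries.setdefault(class_name, OrderedDict())
        entries[object_id] = instance
        entries.move_to_end(object_id)
        limit = self._limits.get(class_name, self._default_limit)
        while len(entries) > limit:
            entries.popitem(last=False)  # evict the least recently used entry

    def get(self, class_name, object_id):
        entries = self._entries.get(class_name)
        if entries is None or object_id not in entries:
            return None
        entries.move_to_end(object_id)  # mark as recently used
        return entries[object_id]
```

The limits argument mirrors the per-class configuration: classes with many small instances can be given a larger budget than classes with few large ones.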
3.6
Framework Extension
A central requirement for the development has been the
extensibility of the framework with regards to the adoption
of newly available database management systems and the
implementation of custom O/R mapping flavors. Therefore,
the classes responsible for realizing the corresponding behavior have been implemented in ways that ought to simplify
the development of custom framework extensions.
3.7
Summary
Main requirements for the implementation were the realization of automatic updates, language-native queries, and
extensibility of the framework. Above, the design decisions necessary for the implementation of this behavior have been
presented. Automatic updates are implemented by a comprehensive algorithm that covers almost all possible changes to
object models and therefore dependably updates
existing mapping descriptions only if necessary.
Language-native queries have been implemented by creating a block-closure analysis system that can handle deep
object graph structures and standard operations on simple
data types as well as accessor methods on complex objects.
Extension points are also available for all designated components of the framework and provide meaningful presets for
the implementation of custom description and table structure handlers, as well as database adapters.
Custom Object-Relational Mapping Descriptions.
The SqsDescriptionHandler serves as an abstract base class that defines the methods that are crucial to the
implementation of custom description handlers.
Only two methods have to be implemented in order to
create new mapping description handlers. sqsDescriptionFor: returns the meta-description of the O/R mapping for
an instance variable of the object that is the subject of the currently
performed persistence operation. While this description
can be stored in arbitrary formats, the method always has
to deliver instances of SqsPersistenceDescriptor. This
translation might be costly with regard to time consumption, but developers can avoid performance problems by
caching the SqueakSave-internal format or persisting it by
utilizing the standard description handlers.
The second method that needs to be implemented is createDescriptions. It is called during the storing process
and, since the description handlers have full access to the
decorator of the persisted object, requires no additional parameter. Although not creating or updating mapping
descriptions would compromise the self-configuring nature of SqueakSave,
custom description handlers that should only
supply reading abilities can omit this implementation.
4.
EVALUATION
The main focus of the implementation of SqueakSave is
the support of fast-evolving object models and the development of a generic architecture that allows for extension of
the available description systems, table structure handlers,
and database adapters. However, performance is an important aspect of each persistence management system [2]. Accordingly, the implemented framework has to be evaluated
with regard to both aspects. The following section provides
benchmark results for SqueakSave in comparison to another
O/R mapping framework for the same development environment. Additionally, the production and development modes
are compared and conclusions are drawn regarding performance bottlenecks and possible optimizations.
8 http://support.microsoft.com/kb/110093
9 http://www.linuxnetworks.de/opendbx
[Figure residue, values in extraction order: Traversal1 (SqueakSave, GLORP), Traversal2a (SqueakSave, GLORP): 125.698, 58.718, 2.237, 5.012]
4.1
Performance
objects. In addition to those standard parts, database creation times have been examined as well. While the insertion
of such a highly intertwined and large object graph might
not reflect everyday usage patterns of object-relational mappers within applications, it is an indicator of potential performance bottlenecks and optimization potential.
The overall database size of the benchmark can be configured in four orders of magnitude. Each of them increases
the number of stored objects and connections between them.
The third-largest version of the benchmark was used, since
it reflects the intended application area for the SqueakSave
framework in terms of database usage. It includes approximately 10,000 atomic parts with 30,000 connections and
thus reflects the database payload of small to mid-sized applications.
Figure 4 presents the overall creation time for the database
schema that is required to perform the OO7 Benchmark.
It is evident that GLORP outperforms SqueakSave by far.
This is mostly a consequence of GLORP's ability to delay the insertion of objects into the database and perform the insertions at a
later point in a bulk operation. Thereby, instead of numerous single queries, only a few large ones are carried out and,
accordingly, the overall execution time decreases. While
this technique obviously could improve the performance of
SqueakSave within such insertions, the decision to only provide direct save methods has been made with regards to API
simplicity and not execution speed.
Comparison with other Object-Relational
Mappers
Since platform-specific limitations and performance bottlenecks, such as overall inferior execution speed or subpar
implementations of vital system classes, impede objective
measurements, a meaningful comparison can only be performed against a comparable system implemented within
Squeak: the Generic Lightweight Object-Relational Persistence
framework (GLORP) [18].
In addition to pure performance comparisons of aspects
like object creation after queries, it is also interesting to see
how the different implementation paradigms of GLORP and
SqueakSave compare to each other. SqueakSave requires
explicit save operations to store or update objects, while
GLORP is transaction based. Accordingly, the transaction
based frameworks are able to accumulate all operations on
the data and perform them, if possible, in bulk SQL statements. The benchmarks will identify scenarios where this
behavior is beneficial with regards to performance.
The PostgreSql Client 1.0 was used in a Squeak 3.10 image
running on the Squeak VM version 3.8.18. SqueakSave was
used in revision 107, and GLORP in version 0.4.169. To further avoid influences on the measured timings, both systems
were set up in their respective production environments, i.e.,
SQL statement logging and other debugging features have
been disabled.
The benchmark consists of two parts. The first one performs a number of plain search queries on the created object
space and measures the timings for each of them. The second part traverses object hierarchies from distinctive starting points and performs some alterations of the respective
Figure 4: Benchmark Database Creation Times
Numerous benchmarks exist to measure the performance
of object persistence technologies. The BUCKY [7] or the
BORD benchmark [19], for example, are especially designed
to analyze the performance of object-relational systems. Different approaches, like the OO7 Benchmark [6], have been
developed to provide objective measurements for any kind
of object persistence, without any special focus.
One of the requirements for the implementation of SqueakSave is to provide persistence in a transparent manner. Thus,
the OO7 Benchmark is utilized for performance measurements. The implementation used for this comparison is
based on the Java version10 of the original benchmark, which
was written in C. It was ported to Java to compare the performance of object-relational mappers and object-oriented
databases [31].
Measurements have been carried out on a 2.4 GHz Intel Core 2 Duo Macbook with 4GB RAM and Mac OS X
10.5.6. PostgreSql version 8.3 has been used as the underlying RDBMS. Each benchmark was run 100 times; measurement results represent the median of all retrieved timings.
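The measurement procedure (repeated runs, median of the timings) can be sketched as a small harness. This is a Python illustration, not the benchmark code used for the paper; median_runtime and the operation argument are hypothetical names.

```python
import statistics
import time


def median_runtime(operation, runs=100):
    """Run a zero-argument callable `runs` times and return the median duration in seconds."""
    timings = []
    for _ in range(runs):
        started = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - started)
    return statistics.median(timings)
```

The median is less sensitive than the mean to occasional outliers, for example runs that are disturbed by garbage collection or other system activity.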
4.2
Query Performance.
The queries performed during the OO7 benchmark continuously increase in terms of complexity and result count.
A description of the query contents is available in the paper that describes the original benchmark, as well as in the
comparison carried out by Zyl et al.
Query times presented in Figure 5 show that, regarding
query performance, GLORP is generally faster than SqueakSave. The large difference in the first query, however, is not
a result of superior query performance, but a consequence
of optimistic caching. Instead of performing the query on
the database, results are delivered directly from the cache.
While this obviously increases query performance, it is also
error-prone. Had the respective object been removed from
the database in another session, the query would return an
object that no longer exists in persisted space.
In all queries, except for the aforementioned one, differences between SqueakSave and GLORP are in a range of
about 10–20%. The slight advantage in query four is a consequence of more efficient join table handling, since the generated SQL statements are almost equal, except for some
minor differences in created table and column alias names.
Unfortunately, the benchmarks reveal the tendency of an
increasing distance between the two frameworks for expanding result sets. In queries seven and eight, the previous gap
becomes vastly larger.
10 http://sourceforge.net/projects/oo7
in advance by a single one. The subsequent traversals, on the
other hand, show that the huge disadvantage of SqueakSave
is reversed completely. This is a consequence of SqueakSave’s caching mechanism, which gradually fills the central
object cache during the first traversal. Hence, the entire
object graph resides in memory for the second run. While
the performance obviously improves because of that mechanism, the same coherence problem mentioned with regard
to GLORP’s first query result applies here.
The traversal times in the following tests obviously increase since the sub elements of the model are not only being traversed, but also updated. Therefore, it was expected
that the advantage of SqueakSave slightly diminishes. However, the traversal times in those tests still show that, for
the traversal of previously loaded object graphs, SqueakSave
seems to be a more efficient solution than GLORP.
The results have shown that SqueakSave, despite its automated mapping features, can compete with existing O/R
mapping solutions in terms of query and traversal performance. In particular, the caching mechanism makes SqueakSave a viable solution for sequential object graph traversals.
The slow insertion times for large data sets could be reduced by implementing a technique similar to the one
used by GLORP. Special attention in future versions
of the implementation has to be paid to the handling of large
result sets, since they obviously impact the performance in
a more than linear manner.
[Bar charts comparing SqueakSave and GLORP for Traversals 1, 2a, 2b, and 2c]
Figure 6: Benchmark Traversal Times
[Bar charts comparing SqueakSave and GLORP for Queries 1 to 5, 7, and 8]
Figure 5: Benchmark Query Times
Concluding the query performance review, it can be stated
that SqueakSave still has potential for optimization. While
the difference for small result sets is minor and might be
improved by smarter caching mechanisms, handling large
result sets still remains an issue.
Traversal Performance.
The chosen traversal measurements of the OO7 benchmark all follow the same pattern. They start at the generated modules and navigate from the design root down to
the atomic parts. With each traversal the depth of navigation through the object graph increases and, additionally,
the last two also alter some data within the atomic parts.
Traversal 2c not only changes those values once, but three
times.
The other available traversals have been omitted, since
they iterate through all characters of document texts and
accordingly do not provide any insights into traversal speed,
but only string operation performance.
Traversal benchmarks have been run independently from
previous database creation and query tests. Those would
have led to extensive caching of the object graph and,
therefore, could not reveal deficiencies within the loading
of associated objects. For subsequent traversals, however,
object caches have not been cleared in order to analyze the
overall traversal performance and the caching of previously
obtained results within one benchmark run.
The results depicted in Figure 6 reveal that SqueakSave suffers from the
currently missing support for eager loading of associations only on the first object graph traversal.
Hence, the associated objects for each of the sub parts have
to be obtained within multiple queries and cannot be loaded
4.3
Development vs. Production Environment
The automatic creation of object-relational mapping descriptions is the main feature of SqueakSave. Due to the
reflection mechanisms used to create this behavior, performance is obviously an issue that has to be examined closely.
Therefore, the OO7 benchmark suite has been performed in
development and production mode. The following results
will reveal fields of usage where the automatic mapping behavior has a negative impact on the overall system performance, but also identify scenarios that are not affected by
it. Additionally, insights into potential optimization points
will be gained from those considerations.
Figure 7 depicts the creation time for the small and tiny
database layout. It is clearly visible that the inspection of every object that has to be stored within the
database slows down the overall performance. This is not
very surprising, since the framework not only
inspects each object but also occasionally writes new descriptors to the image. Additionally, it has to check for
[Chart: Database Creation Time (ms) - Small Database]
that can be incorporated into future framework upgrades.
[Chart: Database Creation Time (ms) - Tiny Database]
• Much of the storing and query execution time has been
spent on the automatic retrieval of configuration objects
from the respective configuration classes. This is a
direct consequence of Squeak not incorporating categories as first-class objects; thus, a time-consuming
lookup for the respective classes has to be performed.
Figure 7: Benchmark Database Creation Times for SqueakSave Modes
• SqueakSave’s current handling of large result sets suffers from the creation of ineffectively sized collections.
While they provide a simple approach to the generation of objects from query results, their traversals are
not optimized if the size exceeds certain values. Therefore, smarter algorithms have to be developed that utilize the Squeak-internal limits for efficient collection
handling by splitting large result sets into smaller portions.
Traversal 2c
28s
0s
• Storing object ids in distinctive caches does not vastly
affect execution speed. However, upon large scale operations, such as the creation of the benchmark database, the impact remains perceivable, since the according caches also grow with the number of in-memory
objects.
GLORP
Traversal 2a
140s
0s
SqueakSave
0s
Query 1
Development
Production
Query 3
Figure 8: Benchmark Traversal Times for SqueakSave Modes
• Fine-grained save operations provide a viable means
for controlling database insertions and updates. However, to accommodate larger object models or collections of objects that have to be inserted, they perform
too many small queries to remain applicable. It is
therefore necessary to implement techniques allowing
for calling the save method on the root of an object
graph and combining insert and update operations in
few SQL queries.
and, if necessary, execute changes to the database schema.
The performance degradation also seems to remain constant
between the different benchmark scales, which implies that
the table and description creation and updates have a much
smaller impact on the performance, than the constant introspection measures. Obviously, after a very short period
of time, no more alterations of the two models are necessary, and thus the difference between the two modes grows
linearly.
While this slow-down might seem too high to be tolerated, developers should have to take into consideration that
creating the scale 1 data model suffices to generate a valid
database schema, that can be consecutively used to create
the data-structures for the small or even bigger benchmarks.
This, and the fact that the object-model can be developed
incrementally without the necessity to alter database structures explicitly, relativizes the obvious performance impact.
Query performance does not differ between the two modes,
since the synchronization between object model and database representation only takes place during object saving
and, accordingly, does not affect search queries.
During traversal measurements, however, the previously
observed differences still apply (see Figure 8). While the first
traversal is barely affected by the current execution mode,
changes to the object model (i.e., Traversals 2b+c) are performed much faster in production mode. It is therefore necessary for developers to thoughtfully utilize this feature if
performance is important. Especially the role-based choice
of the framework mode can provide a viable means for the
balance between execution time and object model flexibility.
4.4
• Regarding object graph traversal, eager object loading
is important. Future versions of the framework should
include this feature to minimize the number of SQL
statements required to obtain the entire object graph.
• During the execution of the benchmark in development
mode, it became apparent that preconditions for description and table update checks provide a vast performance improvement. Therefore, after the completion of the benchmark suite means have been integrated into the framework that not only prevent updates of descriptions and table structures, but also the
examination of their predecessors if it is not utterly
necessary.
4.5
Summary
The presented benchmark results have shown that SqueakSave still has to be optimized for certain fields of application.
Especially the query performance for large result sets is an
issue that deserves closer attention in the future. However,
object graph traversals are implemented in a viable manner
and the results demonstrate that the minimalist intrusion
into object models has a positive impact on such operations.
Additionally, the declarative nature of the query interface,
as well as the simple set-up and integration of the framework
are advantages that make SqueakSave a suitable persistence
solution for application development in Squeak.
Framework Profiling
The benchmark implementation and execution provided
a solid foundation for profiling the framework under a nontrivial workload. A couple of conclusions could be drawn,
98
5.
RELATED WORK
The special capabilities of dynamically-typed object-oriented programming environments like Squeak or other Smalltalk dialects affect the design and implementation of O/R
mapping solutions. While the possibility to analyze the
source code before program execution to determine the required table structure is missing, the often much more elaborate introspection and intercession features allow for more
flexible implementations. Within the scope of this paper,
only mappers for dynamically-typed object-oriented environments are considered. However, since the mapper is
to provide persistence in a manner reminiscent of object-oriented databases, examples of this category have also been
investigated with regard to their support for a relational
database foundation.
Object Databases.
The Gemstone project [5] provides almost transparent
persistence. However, it requires an extensive environment
in order to be applied as a persistence solution. It generally relies on object-oriented database technology to persist
application data, but additionally provides the means to integrate relational database management systems into the
storage process.
Another object-oriented database that provides compatibility with relational systems is db4o [24]. The db4o Replication System (dRS) utilizes Hibernate to replicate application data to specified RDBMSs and is additionally able
to read data from relational databases. Users are thereby
able to perform ad-hoc SQL queries on the data without
having to utilize an environment capable of handling the
db4o-internal data structures. Additionally, this feature allows the integration of legacy data from relational databases
into object-oriented environments.
Dynamic Object-Relational Mappers.
ActiveRecord for Ruby on Rails [12] is a database schema-driven O/R mapping solution that adheres to the convention
over configuration (CoC) principle [28]. While it provides
almost effortless configuration, database schemas and object
models are not automatically kept synchronized. In particular,
alterations of the application object structure have to be
manifested in the database schema before they are available
in the respective object model and subject to persistence
mechanisms. ActiveRecord also introduced dynamic finder
methods as a language-native query interface for relational
databases.
DataMapper (footnote 11), another Ruby O/R mapping framework,
relies on mappings defined through a very minimalist API that
only requires the definition of an SQL type for an
attribute in order to create a valid database schema. After each mapping change, the database creation
method has to be re-run, which erases
the database completely and removes all data. However, the
framework also offers migrations that can gradually add,
alter, or remove columns in existing database tables. The
query API is quite similar to the one present in ActiveRecord.
GLORP [18] provides object-relational persistence through heavy use of meta descriptions. These must follow certain naming conventions and have to be declared for the
model, the database tables, and the relation between model
attributes and database constructs. While GLORP allows
for comprehensive reverse mapping of legacy database structures, its addition to existing applications is impeded by the
mandatory introduction of an id instance variable to each
persisted model class and by the need to provide a complete
mapping description even for trivial cases.
IOSPersistent (footnote 12) followed an approach similar to the
one taken by SqueakSave. It provided fully automatic persistence for all subclasses of an abstract base class of the
framework and automatically created the corresponding table
models. Due to its monolithic architecture, it was not easily extensible and additionally did not allow for
custom object-relational mapping descriptions. It has been
superseded by the ReServe (footnote 13) project, which removed the automatic table creation but, in return, simplified the creation
of custom mapping descriptions and introduced a query API
that has been the foundation for SqueakSave’s language-native queries.
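The idea of persisting all subclasses of an abstract base class and deriving their tables automatically can be illustrated with a minimal sketch. It is written in Python with the standard-library sqlite3 module rather than in Squeak, and it is not the implementation of IOSPersistent, ReServe, or SqueakSave: the names Persistent, SQL_TYPES, table_ddl, save, and Author are assumptions made only for this illustration.

```python
import sqlite3

# Hypothetical mapping from Python value types to SQL column types.
SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT", bool: "INTEGER"}


class Persistent:
    """Hypothetical abstract base class; subclasses need no mapping description."""


def table_ddl(obj):
    """Derive a CREATE TABLE statement by inspecting the instance's attributes."""
    columns = ["id INTEGER PRIMARY KEY"]
    for name, value in vars(obj).items():
        columns.append(f"{name} {SQL_TYPES[type(value)]}")
    table = type(obj).__name__.lower()
    return f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(columns)})"


def save(connection, obj):
    """Create the table on demand, then insert the object's attribute values."""
    connection.execute(table_ddl(obj))
    names = list(vars(obj))
    placeholders = ", ".join("?" for _ in names)
    table = type(obj).__name__.lower()
    cursor = connection.execute(
        f"INSERT INTO {table} ({', '.join(names)}) VALUES ({placeholders})",
        [getattr(obj, name) for name in names],
    )
    return cursor.lastrowid


class Author(Persistent):
    def __init__(self, name, age):
        self.name = name
        self.age = age


connection = sqlite3.connect(":memory:")
author_id = save(connection, Author("Ada", 36))
row = connection.execute(
    "SELECT name, age FROM author WHERE id = ?", (author_id,)
).fetchone()
```

The sketch deliberately omits schema evolution, relationships, and object identity, which are the parts a real mapper has to handle.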
6.
CONCLUSIONS
SqueakSave is a reflective object-relational mapper that
relieves developers of the task of manually maintaining mappings between object models and relational database structures. Additionally, the framework is implemented in a way
that does not interfere with existing object models and thus
can be added almost transparently to existing solutions.
While those features provide an increased degree of flexibility, query and storage performance are slightly diminished. However, since the main goal of the implementation
has been to aid the development process of applications, the
decreased performance is a worthwhile trade-off with
regard to the gain in developer productivity.
The depicted extension points of the framework are intended to
support the development of new and innovative ways to create specialized table structures and mapping description formats that can be easily integrated into the existing solution.
While the current version is able to compete with long-established solutions, future work will especially involve the
optimization of queries that deliver large data sets and the
simultaneous insertion of multiple application objects with
a reduced number of SQL statements.
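The two optimizations named above can be sketched as follows. This is an illustration in Python with the standard-library sqlite3 module, not SqueakSave code; the table part, the functions save_all and iter_in_chunks, and the batch and chunk sizes are assumptions chosen for the example. Inserts are combined into multi-row INSERT statements, and query results are fetched in fixed-size portions instead of one large collection.

```python
import sqlite3


def save_all(connection, names, batch_size=200):
    """Insert all names using one multi-row INSERT per batch.

    Returns the number of SQL statements that were executed.
    """
    statements = 0
    with connection:  # one transaction: commit at the end, roll back on error
        for start in range(0, len(names), batch_size):
            batch = names[start:start + batch_size]
            placeholders = ", ".join("(?)" for _ in batch)
            connection.execute(
                f"INSERT INTO part (name) VALUES {placeholders}", batch
            )
            statements += 1
    return statements


def iter_in_chunks(connection, query, chunk_size=500):
    """Yield result rows while fetching them in fixed-size portions."""
    cursor = connection.execute(query)
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        yield from rows


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT)")
statements = save_all(connection, [f"part-{i}" for i in range(1200)])
loaded = list(iter_in_chunks(connection, "SELECT name FROM part ORDER BY id"))
```

With these sizes, 1200 objects are stored with 6 statements instead of 1200. The batch size has to stay below the bound-parameter limit of the database in use.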
Another important aspect for improvement is the provision of custom mapping description handlers. These would vastly simplify the
seamless integration of SqueakSave into existing applications
by enabling the framework to utilize descriptions that have already been created for other
O/R mappers such as GLORP. Additionally, general-purpose meta description frameworks, such as Magritte [27],
could be integrated not only to map objects to relational
constructs, but also to generate validation methods that are
performed before objects are stored.
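How a single meta description could drive both the relational mapping and a validation step before storing can be sketched as follows. The example is written in Python and does not use or reproduce Magritte's API: AttributeDescription, PERSON_DESCRIPTION, validation_errors, column_definitions, and Person are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeDescription:
    """Hypothetical meta description of one attribute."""
    name: str
    kind: type
    required: bool = True
    sql_type: str = "TEXT"


PERSON_DESCRIPTION = [
    AttributeDescription("name", str, sql_type="TEXT"),
    AttributeDescription("age", int, required=False, sql_type="INTEGER"),
]


def validation_errors(obj, description):
    """Check an object against its description before it is stored."""
    errors = []
    for attribute in description:
        value = getattr(obj, attribute.name, None)
        if value is None:
            if attribute.required:
                errors.append(f"{attribute.name} is required")
        elif not isinstance(value, attribute.kind):
            errors.append(f"{attribute.name} must be {attribute.kind.__name__}")
    return errors


def column_definitions(description):
    """Derive the relational columns from the same description."""
    return [f"{attribute.name} {attribute.sql_type}" for attribute in description]


class Person:
    def __init__(self, name=None, age=None):
        self.name = name
        self.age = age


valid = validation_errors(Person("Ada", 36), PERSON_DESCRIPTION)
invalid = validation_errors(Person(age="old"), PERSON_DESCRIPTION)
columns = column_definitions(PERSON_DESCRIPTION)
```

A save operation would call validation_errors first and only issue SQL if the returned list is empty.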
Besides the optimization and extension points
identified within this paper, results of other research projects could be
adopted to further minimize the intrusiveness of the framework into the application or to further optimize the generation
of SQL queries. The former could be achieved by utilizing
aspect-oriented constructs to provide the persistence functionality as an easily attachable aspect for existing applications [26]. The latter is possible through an in-depth analysis
of inner-application workflows that determines the queries
most suitable within certain execution states [25].
SqueakSave provides a solid foundation for further research and shows that meta-programming and reflection are
viable means to simplify the integration of object-relational persistence mechanisms into applications developed in dynamically-typed object-oriented programming environments.
11 http://www.datamapper.org
12 http://www.squeaksource.com/IOSPersistent.html
13 http://www.squeaksource.com/ReServe.html
7.
REFERENCES
[1] S.W. Ambler. Designing a Robust Persistence Layer.
Softw. Dev., 6(2):73–75, 1998.
[2] S.W. Ambler. Agile Database Techniques. John Wiley
& Sons, 2003.
[3] R. Barcia, G. Hambrick, K. Brown, R. Peterson, and
K.S. Bhogal. Persistence in the Enterprise. IBM Press,
2008.
[4] A.P. Black, S. Ducasse, O. Nierstrasz, D. Pollet,
D. Cassou, and M. Denker. Squeak by Example.
Institute of Computer Science and Applied
Mathematics of the University of Bern, Switzerland,
2008.
[5] P. Butterworth, A. Otis, and J. Stein. The GemStone
object database management system. Commun. ACM,
34(10):64–77, 1991.
[6] M.J. Carey, D.J. DeWitt, and J.F. Naughton. The OO7
Benchmark. In SIGMOD ’93: Proceedings of the 1993
ACM SIGMOD international conference on
Management of data, pages 12–21, New York, NY,
USA, 1993. ACM.
[7] M.J. Carey, D.J. DeWitt, J.F. Naughton,
M. Asgarian, P. Brown, J.E. Gehrke, and D.N. Shah.
The BUCKY object-relational benchmark. In
SIGMOD ’97: Proceedings of the 1997 ACM SIGMOD
international conference on Management of data,
pages 135–146, New York, NY, USA, 1997. ACM.
[8] W.R. Cook. Interfaces and specifications for the
Smalltalk-80 collection classes. SIGPLAN Not.,
27(10):1–15, 1992.
[9] W.R. Cook and C. Rosenberger. Native Queries for
Persistent Objects. Computer Languages, Systems &
Structures, 31:127–141, 2005.
[10] S. Ducasse, O. Nierstrasz, N. Schärli, R. Wuyts, and
A.P. Black. Traits: A mechanism for fine-grained
reuse. ACM Trans. Program. Lang. Syst.,
28(2):331–388, 2006.
[11] J. Elliott. Hibernate: A Developer’s Notebook. O’Reilly
Media, Inc., 2004.
[12] O. Fernandez. The Rails Way. Addison-Wesley, 2007.
[13] M. Fowler, D. Rice, M. Foemmel, E. Hieatt, R. Mee,
and R. Stafford. Patterns of Enterprise Application
Architecture. Addison-Wesley, 2002.
[14] E. Gamma, R. Helm, and J.M. Vlissides. Design
Patterns: Elements of Reusable Object-Oriented
Software. Addison-Wesley, 1995.
[15] T. Goldschmidt, R. Reussner, and J. Winzen. A case
study evaluation of maintainability and performance
of persistency techniques. In ICSE ’08: Proceedings of
the 30th international conference on Software
engineering, pages 401–410, New York, NY, USA,
2008. ACM.
[16] J.J. Hallett and A.J. Kfoury. A formal semantics for
weak references. Technical report, Department of
Computer Science, Boston University, 2005.
[17] E. Klimas, D. Thomas, and S. Skublics. Smalltalk with
style. Prentice Hall, Englewood Cliffs, NJ, 1996.
[18] A. Knight. GLORP: generic lightweight
object-relational persistence. In OOPSLA ’00:
Addendum to the 2000 proceedings of the conference
on Object-oriented programming, systems, languages,
and applications (Addendum), pages 173–174, New
York, NY, USA, 2000. ACM.
[19] S.H. Lee, S.J. Kim, and W. Kim. The BORD
Benchmark for Object-Relational Databases. In DEXA
’00: Proceedings of the 11th International Conference
on Database and Expert Systems Applications, pages
6–20, London, UK, 2000. Springer-Verlag.
[20] U. Leser and F. Naumann. Informationsintegration:
Architekturen und Methoden zur Integration verteilter
und heterogener Datenquellen. Dpunkt Verlag, 2007.
[21] F. Lodhi and M.A. Ghazali. Design of a simple and
effective object-to-relational mapping technique. In
SAC ’07: Proceedings of the 2007 ACM symposium on
Applied computing, pages 1445–1449, New York, NY,
USA, 2007. ACM.
[22] S. Melnik, A. Adya, and P.A. Bernstein. Compiling
mappings to bridge applications and databases. ACM
Trans. Database Syst., 33(4):1–50, 2008.
[23] OMG. UML 2.0 Specification, 2005.
[24] J. Paterson, S. Edlich, H. Hörning, and R. Hörning.
The Definitive Guide to db4o. Apress, Berkeley, CA,
USA, 2006.
[25] P. Pohjalainen and J. Taina. Self-configuring
object-to-relational mapping queries. In PPPJ ’08:
Proceedings of the 6th international symposium on
Principles and practice of programming in Java, pages
53–59, New York, NY, USA, 2008. ACM.
[26] A. Rashid and R. Chitchyan. Persistence as an aspect.
In AOSD ’03: Proceedings of the 2nd international
conference on Aspect-oriented software development,
pages 120–129, New York, NY, USA, 2003. ACM.
[27] L. Renggli. Magritte - Meta-Described Web
Application Development. Master’s thesis, Software
Composition Group, University of Berne, 2006.
[28] C. Richardson. ORM in Dynamic Languages. Queue,
6(3):28–37, 2008.
[29] D. Thomas. Ubiquitous applications: embedded
systems to mainframe. Commun. ACM,
38(10):112–114, 1995.
[30] D. Ungar and R.B. Smith. Self: The power of
simplicity. SIGPLAN Not., 22(12):227–242, 1987.
[31] P. Van Zyl, D.G. Kourie, and A. Boake. Comparing
the performance of object databases and ORM tools.
In SAICSIT ’06: Proceedings of the 2006 annual
research conference of the South African institute of
computer scientists and information technologists on
IT research in developing countries, pages 1–11,
Pretoria, Republic of South Africa, 2006.