VoltDB Release Notes
Product: VoltDB
Version: 5.0
Release Date: January 28, 2015
This document provides information about known issues and limitations to the current release of VoltDB. If you
encounter any problems not listed below, please be sure to report them to [email protected]. Thank you.
What's New for VoltDB 5.0
VoltDB 5.0 is a major release. It consolidates many new features introduced over the past few months, such as support for additional SQL syntax and functions, integration with Hadoop through the HTTP export connector, and JDBC
and Apache Kafka import utilities. But the major change in 5.0 is that you no longer need to compile an application
catalog before starting the database.
Creating a VoltDB database is now as easy as A, B, C:
A. Start a database cluster.
B. Load the classes for any Java stored procedures from a standard Jar file.
C. Enter your schema DDL using VoltDB's sqlcmd command line utility.
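Steps B and C can be scripted rather than typed at the sqlcmd prompt. The following is a minimal sketch, not a definitive procedure: it assumes sqlcmd reads its input from standard input, and the helper name sqlcmd_setup_script and the file names myprocs.jar and myschema.sql are hypothetical placeholders. The classes are loaded first so that any CREATE PROCEDURE FROM CLASS statements in the schema can find them.

```shell
#!/bin/sh
# Emit the sqlcmd input for steps B and C: load the stored procedure
# classes, then read the schema file. Both file names are placeholders.
sqlcmd_setup_script() {
    jar_file=$1
    schema_file=$2
    printf 'load classes %s;\n' "$jar_file"
    printf 'file %s;\n' "$schema_file"
}

# Step A is run separately to start the cluster.
# Steps B and C, once the cluster is running (assumes sqlcmd reads stdin):
#   sqlcmd_setup_script myprocs.jar myschema.sql | sqlcmd
sqlcmd_setup_script myprocs.jar myschema.sql
```

Run as shown, the script only prints the two directives, so you can review them before piping them to sqlcmd.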
VoltDB supports standard SQL data definition language (DDL) statements for adding, removing, and modifying
schema objects on the fly, including CREATE, ALTER, and DROP. You can also declare and partition stored procedures interactively.
Use of interactive DDL makes creating and modifying a VoltDB database as easy as possible without sacrificing any
of its leading edge performance and scalability or enterprise-level durability and availability features.
Special Considerations for Existing Customers
The elimination of the application catalog is a significant change — and simplification — in the database design and
deployment process. Support for existing customers is and always will be a key goal for VoltDB. Therefore, use of
application catalogs, although deprecated, will continue to be supported for several releases to allow existing customers
to adjust their development and production processes.
That being said, migrating from the use of application catalogs to using interactive DDL is a very simple process. With
only a few exceptions, existing schema files used to compile application catalogs in previous versions work as-is as
input to sqlcmd in version 5.0. Also, existing application catalogs can be used as Jar files for loading stored procedures
in VoltDB 5.0. So in many cases, it is possible to use existing resources in the new system without any modifications
to source code or configuration files.
If you wish to continue using application catalogs, all you need to do is add the attribute schema="catalog"
to the <cluster> tag in the deployment file. That's it.
If existing customers want to migrate to using interactive DDL, in most cases it is simply a case of packaging the
Java stored procedure class files into a Jar file using the standard jar command rather than compiling them into an
application catalog. The only caveats you should be aware of are the following:
1.1.
Remove IMPORT CLASSES
The IMPORT CLASSES statement, which allowed the inclusion of supplemental class files (invoked by
stored procedures) into the application catalog, is not supported in interactive mode. So remove any IMPORT
CLASSES statements in your schema.
Instead, simply include the supplemental class files in the Jar file with the stored procedures. They are automatically included in the database and made accessible to stored procedures when the Jar file is loaded.
1.2.
Combine CREATE PROCEDURE and PARTITION PROCEDURE into a single statement for complex procedures
In certain cases, stored procedures must be single-partitioned because the queries the procedure contains are
too complex. For example, if a query joins two or more partitioned tables, the procedure itself must be partitioned and the tables must be joined on their partitioning columns. As a consequence, you cannot issue a
plain CREATE PROCEDURE statement for that procedure interactively, because procedures are multi-partitioned
by default.
Instead, combine the CREATE PROCEDURE and PARTITION PROCEDURE statements into a single CREATE PROCEDURE statement using the PARTITION ON clause. For example, the following two statements:
CREATE PROCEDURE FROM CLASS acme.procs.GetStoreByRegion;
PARTITION PROCEDURE GetStoreByRegion ON TABLE store COLUMN region;
can be combined into a single CREATE PROCEDURE statement:
CREATE PROCEDURE PARTITION ON TABLE store COLUMN region
    FROM CLASS acme.procs.GetStoreByRegion;
Upgrading From Older Versions
The process for upgrading from a previous version of VoltDB is as follows:
1. Place the database in admin mode (using voltadmin pause).
2. Perform a manual snapshot of the database (using voltadmin save).
3. Shut down the database (using voltadmin shutdown).
4. Upgrade the VoltDB software.
5. Restart the database (using the voltdb create action).
6. Reload any Java stored procedures and the database schema (using the sqlcmd directives load classes and file).
7. Restore the snapshot created in Step #2 (using voltadmin restore).
8. Return the database to normal operations (using voltadmin resume).
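The steps above can be sketched as a dry-run checklist that prints each command rather than running it. This is an illustration only: the snapshot directory, the snapshot name (nonce), and the Jar and schema file names are hypothetical placeholders, and the directory/nonce arguments and the --blocking flag on voltadmin save are assumptions to verify against the voltadmin documentation for your release. The voltdb create arguments are taken from the example elsewhere in these notes.

```shell
#!/bin/sh
# Dry-run sketch: prints the upgrade commands in order, runs nothing.
# All four values below are hypothetical placeholders.
SNAPSHOT_DIR=${SNAPSHOT_DIR:-/tmp/voltdb/backup}
NONCE=${NONCE:-pre_upgrade}
JAR=${JAR:-myprocs.jar}
SCHEMA=${SCHEMA:-myschema.sql}

upgrade_plan() {
    # Argument forms for save/restore are assumptions; check your docs.
    cat <<EOF
1. voltadmin pause
2. voltadmin save --blocking $SNAPSHOT_DIR $NONCE
3. voltadmin shutdown
4. install the new VoltDB software on every node
5. voltdb create --deployment=deployment.xml --host=voltsvr1
6. sqlcmd, then enter: load classes $JAR; file $SCHEMA;
7. voltadmin restore $SNAPSHOT_DIR $NONCE
8. voltadmin resume
EOF
}

upgrade_plan
```

Printing the plan first makes it easy to confirm that the same directory and snapshot name are used for both the save in step 2 and the restore in step 7.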
Changes Since the Last Release
Users of previous versions of VoltDB should take note of the following changes that might impact their existing
applications.
1. Release V5.0
1.1.
Interactive DDL
The major new feature in VoltDB 5.0 is the ability to enter data definition language (DDL) statements interactively, for example by using sqlcmd on the command line or the VoltDB Management Center SQL Query
interface. This makes the process of creating a database and defining the schema more flexible. As part of the
support for interactive DDL, the following features have been added:
• Support for the DROP and ALTER statements for removing and modifying existing schema objects
• The ability to combine the CREATE PROCEDURE and PARTITION PROCEDURE statements into a single CREATE PROCEDURE statement with a PARTITION ON clause
• A new system procedure, @UpdateClasses, for adding and removing classes
• Two corresponding sqlcmd directives, load classes and remove classes, that perform this function from the
command line
Please note that processing DDL interactively can take longer than compiling an application catalog all at
once. This is most noticeable when processing a large schema and especially on a multi-node cluster (where
each change must be coordinated among the servers).
If you find entering DDL interactively too slow, it is possible to revert to precompiling the schema before
starting the database. You have two choices:
• You can return to using catalogs exclusively, by setting the schema="catalog" attribute in the deployment file.
• You can compile the initial schema as a catalog, start the database specifying the catalog on the voltdb
create command, but leave the deployment file unchanged. In this case, the database starts from the catalog,
but you can use interactive DDL to modify the schema and stored procedures once the database is running.
Performance improvements for processing large schemas interactively are expected in upcoming releases.
1.2.
Ability to "trim" rows using LIMIT PARTITION ROWS EXECUTE
The LIMIT PARTITION ROWS constraint now supports an EXECUTE clause that lets you specify a DELETE
statement that is executed when the constraint value is exceeded. The EXECUTE clause gives you the ability to
automatically "prune" older data when the constraint is reached. See the description of the CREATE TABLE
statement in the Using VoltDB manual for details.
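As an illustration of the EXECUTE clause, the following sketch emits DDL for a hypothetical events table capped at 1000 rows per partition. The table, its columns, the constraint name, the one-hour cutoff, and the helper name trim_ddl are all assumptions made for the example, as is the use of the TO_TIMESTAMP, SINCE_EPOCH, and NOW functions in the DELETE statement. Check the CREATE TABLE description in the Using VoltDB manual for the exact syntax and restrictions before relying on it.

```shell
#!/bin/sh
# Emit DDL for a hypothetical "events" table limited to 1000 rows per
# partition. The DELETE in the EXECUTE clause prunes rows older than an
# hour when the limit is reached. All names and values are illustrative.
trim_ddl() {
    cat <<'EOF'
CREATE TABLE events (
    event_id   BIGINT    NOT NULL,
    event_time TIMESTAMP NOT NULL,
    payload    VARCHAR(256),
    CONSTRAINT events_row_limit LIMIT PARTITION ROWS 1000
        EXECUTE (DELETE FROM events
                 WHERE event_time <
                       TO_TIMESTAMP(SECOND, SINCE_EPOCH(SECOND, NOW) - 3600))
);
PARTITION TABLE events ON COLUMN event_id;
EOF
}

# To apply it to a running database (assumes sqlcmd reads stdin):
#   trim_ddl | sqlcmd
trim_ddl
```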
1.3.
Support for HttpFS targets in Hadoop export
The HTTP connector now supports Apache HttpFS (Hadoop HDFS over HTTP) servers as a target when
exporting using the WebHDFS protocol. Set the export property httpfs.enable to "true" when exporting
to HttpFS servers.
1.4.
Addition of the ORDER BY clause to the DELETE statement
It is now possible to use the ORDER BY clause with LIMIT and/or OFFSET when performing a DELETE
operation. ORDER BY allows you to more selectively remove database rows. For example, the following
DELETE query removes the five oldest records, based on a timestamp column:
DELETE FROM events ORDER BY event_time ASC LIMIT 5;
Note that DELETE queries that include the ORDER BY clause must be single-partitioned and the ORDER
BY clause must be deterministic. See the description of the DELETE statement in the Using VoltDB manual
for details.
1.5.
Bug fixes
In addition to the new features listed above, VoltDB V5.0 includes fixes to several known issues:
• Previously, there was an undocumented limit of 200 kilobytes to the size of the parameter list on the JSON
interface. This limit has been extended to 2 megabytes.
Known Limitations
The following are known limitations to the current release of VoltDB. Workarounds are suggested where applicable.
However, it is important to note that these limitations are considered temporary and are likely to be corrected in future
releases of the product.
1. Command Logging
1.1.
Command logs can only be recovered to a cluster of the same size.
To ensure complete and accurate restoration of a database, recovery using command logs can only be performed
to a cluster with the same number of unique partitions as the cluster that created the logs. If you restart and
recover to the same cluster with the same deployment options, there is no problem. But if you change the
deployment options for number of nodes, sites per host, or K-safety, recovery may not be possible.
For example, if a four node cluster is running with four sites per host and a K-safety value of one, the cluster
has two copies of eight unique partitions (4 X 4 / 2). If one server fails, you cannot recover the command
logs from the original cluster to a new cluster made up of the remaining three nodes, because the new cluster
only has six unique partitions (3 X 4 / 2). You must either replace the failed server to reinstate the original
hardware configuration or otherwise change the deployment options to match the number of unique partitions.
(For example, increasing the sites per host to eight and K-safety to two.)
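The arithmetic in this example can be checked with a small helper. This is a sketch based on the figures above, where the number of unique partitions is the total number of sites divided by the number of copies (K-safety plus one). The function name unique_partitions is hypothetical.

```shell
#!/bin/sh
# unique partitions = (nodes * sites per host) / (K-safety + 1)
unique_partitions() {
    nodes=$1
    sites_per_host=$2
    k_safety=$3
    echo $(( nodes * sites_per_host / (k_safety + 1) ))
}

unique_partitions 4 4 1   # original cluster: prints 8
unique_partitions 3 4 1   # three remaining nodes: prints 6
unique_partitions 3 8 2   # adjusted deployment options: prints 8
```

The first and third configurations both yield eight unique partitions, which is why command logs from the original cluster can be recovered to the adjusted three-node cluster but not to the unadjusted one.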
1.2.
Do not use the subfolder name "segments" for the command log snapshot directory.
VoltDB reserves the subfolder "segments" under the command log directory for storing the actual command
log files. Do not add, remove, or modify any files in this directory. In particular, do not set the command log
snapshot directory to a subfolder "segments" of the command log directory, or else the server will hang on
startup.
2. Database Replication
2.1.
Node failure and rejoin on the replica during csvload operations can cause uncaught data duplication
If a node on the replica database fails while the master is loading data with the csvloader (or its associated bulk
loading methods), when the node rejoins it is possible data already loaded gets reloaded during the rejoin. This
can cause divergence between the master and replica databases.
To be safe until this limitation is corrected, if a node on the replica database fails while the master database
is bulk loading data, you should stop the replica and the DR agent and restart replication once the bulk load
is complete.
3. Export
3.1.
Dropping export tables, reconfiguring export, then adding export tables can cause unpredictable results.
It is possible to reconfigure export while the database is running as long as no export tables exist. Once export
tables are defined, you cannot modify the export configuration without restarting the database. However, if
you start export, then delete the export tables (using DROP TABLE), the database lets you modify the export
configuration. The issue is that declaring export tables after this change does not produce the expected export.
Also, command logs for the database may not be recoverable.
As a general rule, do not reconfigure export on a running database using voltadmin update or @UpdateApplicationCatalog until this bug is fixed.
3.2.
Synchronous export in Kafka can use up all available file descriptors and crash the database.
A bug in the Apache Kafka client can result in file descriptors being allocated but not released if the
producer.type attribute is set to "sync" (which is the default). The consequence is that the system eventually
runs out of file descriptors and the VoltDB server process will crash.
Until this bug is fixed, use of synchronous Kafka export is not recommended. The workaround is to set the
Kafka producer.type attribute to "async" using the VoltDB export properties.
4. SQL and Stored Procedures
4.1.
Comments containing unmatched single quotes in multi-line statements can produce unexpected results.
When entering a multi-line statement at the sqlcmd prompt, if a line ends in a comment (indicated by two
hyphens) and the comment contains an unmatched single quote character, the following lines of input are not
interpreted correctly. Specifically, the comment is incorrectly interpreted as continuing until the next single
quote character or a closing semi-colon is read. This is most likely to happen when reading in a schema file
containing comments. This issue is specific to the sqlcmd utility.
A fix for this condition is planned for an upcoming point release.
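Until then, a schema file can be checked for the problem before it is passed to sqlcmd. The following is a heuristic sketch, not part of VoltDB: the function name find_risky_comments is hypothetical, and it simply reports lines where the text after the first double hyphen contains an odd number of single quotes. It treats a double hyphen inside a string literal as the start of a comment, so review its output rather than trusting it blindly.

```shell
#!/bin/sh
# Report lines whose "--" comment contains an odd number of single quotes.
# Reads the named files, or standard input when no file is given.
find_risky_comments() {
    awk -v q="'" '
    {
        start = index($0, "--")          # position of the comment marker
        if (start == 0) next             # no comment on this line
        comment = substr($0, start + 2)
        quotes = gsub(q, "", comment)    # gsub returns the number of matches
        if (quotes % 2 == 1) printf "%d: %s\n", FNR, $0
    }' "$@"
}

# Example: only line 1 is reported; line 2 has a matched pair of quotes.
find_risky_comments <<'EOF'
CREATE TABLE store (     -- the store's master record
    region VARCHAR(16),  -- for example 'EMEA'
    id INTEGER NOT NULL
);
EOF
```

For a real schema, pass the file name instead, as in find_risky_comments myschema.sql, and reword or remove the quote in any comment it reports.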
4.2.
Do not use assertions in VoltDB stored procedures.
VoltDB currently intercepts assertions as part of its handling of stored procedures. Attempts to use assertions
in stored procedures for debugging or to find programmatic errors will not work as expected.
4.3.
The UPPER() and LOWER() functions currently convert ASCII characters only.
The UPPER() and LOWER() functions return a string converted to all uppercase or all lowercase letters, respectively. However, for the initial release, these functions only operate on characters in the ASCII character
set. Other case-sensitive UTF-8 characters in the string are returned unchanged. Support for all case-sensitive
UTF-8 characters will be included in a future release.
5. Client Interfaces
5.1.
Avoid using decimal datatypes with the C++ client interface on 32-bit platforms.
There is a problem with how the math library used to build the C++ client library handles large decimal values
on 32-bit operating systems. As a result, the C++ library cannot serialize and pass Decimal datatypes reliably
on these systems.
Note that the C++ client interface can send and receive Decimal values properly on 64-bit platforms.
6. Enterprise Manager
Important
The VoltDB Enterprise Manager is part of the VoltDB Enterprise Edition and continues to be supported for
customers who are currently using it. However, due to limitations in its implementation, no further development work is being done on the Enterprise Manager and it is not recommended for new deployments. The
Enterprise Manager's functionality will be replaced by new, more robust, deployment and management capabilities in the future.
6.1.
Manual snapshots not copied to the Management Server properly.
Normally, manual snapshots (those created with the Take a Snapshot button) are copied to the management
server. However, if automated snapshots are also being created and copied to the management server, it is
possible for an automated snapshot to override the manual snapshot.
If this happens, the workaround is to turn off automated snapshots (and their copying) temporarily. To do
this, uncheck the box for copying snapshots, set the frequency to zero, and click OK. Then re-open the Edit
Snapshots dialog and take the manual snapshot. Once the snapshot is complete and copied to the management
server (that is, the manual snapshot appears in the list on the dialog box), you can re-enable copying and
automated snapshots.
6.2.
Old versions of Enterprise Manager files are not deleted from the /tmp directory
When the Enterprise Manager starts, it unpacks files that the web server uses into a subfolder of the /tmp
directory. It does not delete these files when it stops. Under normal operation, this is not a problem. However,
if you upgrade to a new version of the Enterprise Edition, files for the new version become intermixed with
the older files and can result in the Enterprise Manager starting databases using the wrong version of VoltDB.
To avoid this situation, make sure these temporary files are deleted before starting a new version of VoltDB
Enterprise Manager.
The /tmp directory is emptied every time the server reboots. So the simplest workaround is to reboot your
management server after you upgrade VoltDB. Alternatively, you can delete these temporary files manually by
deleting the winstone subfolders in the /tmp directory:
$ rm -vr /tmp/winstone*
6.3.
Enterprise Manager configuration files are not upwardly compatible.
When upgrading VoltDB Enterprise Edition, please note that the configuration files for the Enterprise Manager
are not upwardly compatible. New product features may make existing database and/or deployment definitions
unusable. It is always a good idea to delete existing configuration information before upgrading. You can delete
the configuration files by deleting the ~/.voltdb directory. For example:
$ rm -vr ~/.voltdb
6.4.
Enterprise Manager cannot start two databases on the same server.
In the past, it was possible to run two (or more) databases on a single physical server by defining two logical
servers with the same IP address and making the ports for each database unique. However, as a result of internal
optimizations introduced in VoltDB 2.7, this technique no longer works when using the Enterprise Manager.
We expect to correct this limitation in a future release. Note that it is still possible to start multiple databases
on a single server manually using the VoltDB shell commands.
6.5.
The Enterprise Manager cannot restart and recover a replica database as a master.
Using the VoltDB Enterprise Manager, if a replica database was started with command logging, then stopped
(intentionally or by accident), the Enterprise Manager cannot restart the database as a normal database using the
recover action to reinstate the database's previous state. The Enterprise Manager can restore from a snapshot.
Implementation Notes
The following notes provide details concerning how certain VoltDB features operate. The behavior is not considered
incorrect. However, this information can be important when using specific components of the VoltDB product.
1. VoltDB Management Center
1.1.
Schema updates clear the stored procedure data table in the Management Center Monitor section
Any time the database schema or stored procedures are changed, the data table showing stored procedure
statistics at the bottom of the Monitor section of the VoltDB Management Center gets reset. As soon as new
invocations of the stored procedures occur, the statistics table will show new values based on performance after
the schema update. Until invocations occur, the procedure table is blank.
2. SQL
2.1.
Do not use UPDATE to change the value of a partitioning column
For partitioned tables, the value of the column used to partition the table determines what partition the row
belongs to. If you use UPDATE to change this value and the new value belongs in a different partition, the
UPDATE request will fail and the stored procedure will be rolled back.
Updating the partition column value may or may not cause the record to be repartitioned (depending on the
old and new values). However, since you cannot determine if the update will succeed or fail, you should not
use UPDATE to change the value of partitioning columns.
The workaround, if you must change the value of the partitioning column, is to use both a DELETE and an
INSERT statement to explicitly remove and then re-insert the desired rows.
2.2.
Certain SQL syntax errors result in the error message "user lacks privilege or object not found" when compiling
the runtime catalog.
If you refer to a table or column name that does not exist, the VoltDB compiler issues the error message "user
lacks privilege or object not found". This can happen, for example, if you misspell a table or column name.
Another situation where this occurs is if you mistakenly use double quotation marks to enclose a string literal
(such as WHERE ColumnA="True"). ANSI SQL requires single quotes for string literals and reserves double
quotes for object names. In the preceding example, VoltDB interprets "True" as an object name, cannot resolve
it, and issues the "user lacks privilege" error.
If you receive this error, the workaround is to look for misspelled table or column names, or string literals
delimited by double quotes, in the offending SQL statement.
3. Runtime
3.1.
File Descriptor Limits
VoltDB opens a file descriptor for every client connection to the database. In normal operation, this use of
file descriptors is transparent to the user. However, if there are an inordinate number of concurrent client
connections, or clients open and close many connections in rapid succession, it is possible for VoltDB to exceed
the process limit on file descriptors. When this happens, new connections may be rejected or other disk-based
activities (such as snapshotting) may be disrupted.
In environments where there are likely to be an extremely large number of connections, you should consider
increasing the operating system's per-process limit on file descriptors.
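To see the limits currently in effect, you can query them from the shell that starts the server. This is a minimal sketch using the standard ulimit shell builtin. The function name show_fd_limits and the value 65536 are illustrative assumptions, not VoltDB recommendations.

```shell
#!/bin/sh
# Print the soft and hard per-process limits on open file descriptors
# for this shell and any process started from it.
show_fd_limits() {
    echo "soft=$(ulimit -S -n) hard=$(ulimit -H -n)"
}

show_fd_limits

# To raise the soft limit before starting the server from this shell
# (it cannot exceed the hard limit; 65536 is only an example value):
#   ulimit -n 65536
```

A limit changed this way applies only to the current shell session and its child processes. A permanent change is made in the operating system's own configuration, which varies by platform.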
3.2.
Protecting VoltDB Against Port Scanners
VoltDB uses a number of different ports for interprocess communication as well as features such as HTTP
access, DR, and so on. Port scanning software often interferes with normal operation of such ports by sending
bogus data to them in an attempt to identify open ports.
VoltDB has hardened its port usage to ignore unexpected or irrelevant data from port scanners. However, the
ports used for Database Replication (DR) cannot be protected in this way. So, in V4.6, a Java property was
introduced to allow you to disable the DR ports, for situations where port scanning cannot be avoided. To
disable the DR ports, set the Java property VOLTDB_DISABLE_DR to true before starting the database
process. For example:
$ export VOLTDB_OPTS="-DVOLTDB_DISABLE_DR=true"
$ voltdb create myapplication.jar \
    --deployment=deployment.xml \
    --host=voltsvr1
Note that, if you disable the DR ports, you cannot use the database as a master for database replication.