Loading Data
Aster Loader Tool
Here’s how we run Aster Loader, piping its input data through sed:
$ cat sampleData-3.tsv \
| sed -e 's_\\_\\\\_g' \
| ncluster_loader -h $QUEEN_IP -d my_db -U beehive -w beehive testo /dev/stdin
Loading tuples using node '192.168.28.100'.
3 tuples were successfully loaded into table 'test'.
Here are the result rows:
$ act -h $SYSMAN_IP -d my_db -U beehive -w beehive -c 'SELECT * FROM testo ORDER BY id;'
id | string
----+---------------------------------------------------------------------
1 | This is just a line.
5 | How often do back-slash characters ('\') appear in your data?
6 | And how often do you think they actually disappear: 1 \? 2 \? 3 \?
7 | \W\a\y \t\o\o \o\f\t\e\n\! \! \!
(4 rows)
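To check what the sed filter in the pipeline above does before involving a cluster, it can be run on its own. The sketch below wraps the same expression in a shell function (the name escape_backslashes is ours, not part of the Aster tools) and feeds it one sample line. Each backslash in the input comes out doubled, which is presumably why the backslashes survive in the loaded rows.

```shell
# Hypothetical helper: wraps the sed expression from the example above.
# It replaces every single backslash with two backslashes.
escape_backslashes() {
  sed -e 's_\\_\\\\_g'
}

# %s keeps printf from interpreting the backslashes in the sample text.
printf '%s\n' 'how often: 1 \? 2 \?' | escape_backslashes
# prints: how often: 1 \\? 2 \\?
```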
Example with Error Logging
Use the same assumptions as in the previous example, and assume we will log malformed rows (that is, rows that the loader cannot interpret and therefore cannot load) to a table called “2010MarchSales_error_table,” tagging each error row with the label “2010MarchSalesErr”. At the end of the load attempt, the error data will also be copied to the file /home/ccrisp/2010MarchSales_error.txt. We’ll set a limit of 100 error rows; if more than 100 errors are encountered, the load will be cancelled.
To do this:
1. Create the custom error logging table: Run ACT as a user with table creation rights (for example, a user with the catalog_admin role) and type:
CREATE TABLE 2010MarchSales_error_table () INHERITS (nc_errortable_part);
2. Exit ACT, return to the command line, and type:
$ ./ncluster_loader -h 10.50.25.100 -w beehive -D "~" --el-enabled \
--el-label 2010MarchSalesErr --el-limit 100 \
--el-table 2010MarchSales_error_table \
--el-errfile /home/ccrisp/2010MarchSales_error.txt \
sales_fact 2010MarchSales_data.txt
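Before choosing a value for --el-limit, it can help to estimate how many rows are likely to be rejected. The sketch below is a rough pre-check only. It assumes that a malformed row is one whose number of "~"-delimited fields differs from the table's column count, which is just one of the ways a row can fail to load. The function name count_bad_rows and the column count in the usage line are illustrative.

```shell
# Hypothetical pre-check, not part of the Aster Loader Tool.
# Counts rows whose number of delimiter-separated fields is not the
# expected column count.
count_bad_rows() {
  # $1 = field delimiter (a single character)
  # $2 = expected number of columns
  # $3 = data file
  awk -F"$1" -v n="$2" 'NF != n { bad++ } END { print bad + 0 }' "$3"
}

# Usage (file name from the example above; 5 columns is an assumption):
#   count_bad_rows '~' 5 2010MarchSales_data.txt
```

If the count is already above the limit you plan to set, the load would be cancelled, so it is worth cleaning the file first.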
For more information on logging malformed rows, see “Error Logging” on page 204.
Hints for Successful Loading
Recommended Character Set Is UTF-8
The default character set for Aster databases is UTF-8, and the Aster team recommends that you load only UTF-8 formatted data when loading to char, varchar, and text columns.
For the tools you use to prepare files and to connect to Aster, make sure you have set the default character set to UTF-8. This is particularly important if you are loading data from a Windows-based machine. For example, if you will use an SSH client (e.g., PuTTY) to run ncluster_loader, make sure you set the SSH client’s default character set to UTF-8.
Aster Client Guide 201
We recommend that, prior to loading, you convert your text files to UTF-8. For example, if you’re a Notepad++ user, you can use the command, “Convert to UTF8 without BOM.”
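On a Linux client, one way to check and convert a file from the command line is iconv. The sketch below assumes iconv is installed and, for the conversion step, that the source file is encoded in Windows-1252; substitute the encoding your file actually uses. The function names are illustrative.

```shell
# Succeeds (exit status 0) if the file decodes cleanly as UTF-8.
is_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Convert a file to UTF-8. The source encoding defaults to Windows-1252,
# which is an assumption; pass the real encoding as the third argument.
to_utf8() {
  # $1 = input file, $2 = output file, $3 = source encoding (optional)
  iconv -f "${3:-WINDOWS-1252}" -t UTF-8 "$1" > "$2"
}

# Usage (file names are placeholders):
#   is_utf8 sales.txt || to_utf8 sales.txt sales.utf8.txt
```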
Newline Character
Make sure your data file uses a consistent character to represent newlines. If the file uses \r\n for newlines, then it should not also use \n for newlines, and vice versa. If your file contains both UNIX-style \n newlines and Windows-style \r\n newlines, then you must clean the file before you try to load it. The UNIX command dos2unix can be useful for doing this.
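If dos2unix is not available, the same check and cleanup can be approximated with awk. The sketch below reports how many lines end in \r\n versus plain \n, and strips a trailing carriage return from every line. The function names are illustrative, and the approach assumes that a carriage return at the end of a line is never legitimate data.

```shell
# Report how many lines end in CRLF and how many in plain LF.
newline_report() {
  awk '/\r$/ { crlf++ } END { printf "crlf=%d lf=%d\n", crlf, NR - crlf }' "$1"
}

# Write the file to standard output with a trailing CR removed from each line.
strip_cr() {
  awk '{ sub(/\r$/, ""); print }' "$1"
}

# Usage (file names are placeholders):
#   newline_report data.txt
#   strip_cr data.txt > data.clean.txt
```

A file is consistent when one of the two counts is zero.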
Multiple Loader Nodes
The Aster Loader Tool supports the use of many Aster Loader nodes. For most loading tasks, the queen is sufficient to handle all loading, but for high volume loading, you can add dedicated loader nodes to your cluster.
To use a loader node, you invoke one or more ncluster_loader instances that will load through that loader node. You may run many ncluster_loader sessions in parallel against one loader node, and you may use many loader nodes in parallel (with each node handling loads from a number of ncluster_loader instances).
To do this, you invoke each ncluster_loader instance with the -l (and optionally -f) argument to specify the loader node. The relevant flags are:
• as always, the --hostname option (-h) provides the queen IP address;
• the --loader flag (-l) provides the IP address of the desired loader node; and
• optionally, the --force-loader flag (-f) forces the use of the desired loader node.
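As an illustration of the flags above, the following sketch starts one ncluster_loader session per input file, all through the same loader node, and waits for them to finish. The database name and credentials are carried over from the earlier examples. The function name, the LOADER_CMD override (handy for a dry run with echo), the IP addresses, and the file names are assumptions. The optional -f flag is left out.

```shell
# Hypothetical wrapper: run one loader session per data file in parallel,
# all routed through the same loader node.
load_via_node() {
  # $1 = queen IP, $2 = loader node IP, $3 = target table, rest = data files
  queen=$1; node=$2; table=$3
  shift 3
  pids=""
  for f in "$@"; do
    # LOADER_CMD defaults to ncluster_loader; set it to echo for a dry run.
    ${LOADER_CMD:-ncluster_loader} -h "$queen" -l "$node" \
      -d my_db -U beehive -w beehive "$table" "$f" &
    pids="$pids $!"
  done
  # Wait for every session; report failure if any of them failed.
  status=0
  for p in $pids; do
    wait "$p" || status=1
  done
  return "$status"
}

# Dry run that prints the commands instead of running them:
#   LOADER_CMD=echo load_via_node 10.50.25.100 10.50.25.101 sales_fact part1.txt part2.txt
```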
Loading Parent Child Tables with Inheritance
The --auto-partition option is retained in order to support parent/child tables created with inheritance. It is not used when working with parent/child tables created with autopartitioning. Using --auto-partition instructs the Aster Loader Tool to automatically send each row to the right child table during loading. Each row is directed to a table according to the check constraints you have set up on the child tables.
For example, if you partition your data into daily child tables based on the contents of a timestamp column, each ultimate child table in your schema will have a CHECK constraint that specifies what value of timestamp may be loaded into that child table. When you load data, the autopartitioning feature will route each row to the appropriate child table, based on its timestamp value.
Use autopartitioning like this:
1. Set up the parent-child table schema in your database. On each ultimate child table, write a CHECK constraint that specifies what data may be loaded into that child table.
Notice!
Aster Database does not detect overlapping constraints on peer child tables. As a result, the correct placement of a row during loading can be indeterminate.
Workaround: Take care that the constraints you define do not create overlapping logical partitions. A simple mistake would be to set up range constraints like this:
CHECK ( ymdh BETWEEN '2005-07-01' AND '2005-08-01' );
CHECK ( ymdh BETWEEN '2005-08-01' AND '2005-09-01' );
In this example, it is not clear in which partition the ymdh value '2005-08-01' resides.
2. Prepare your data for loading:
a. Your data input file can contain data values that will end up in many different child tables.
b. To handle rows that do not fit your partitioning scheme, you can rely on the standard error logging feature of the Aster Loader Tool (see “Error Logging” on page 204) or create a check constraint that will catch rows that you do not want to include in your partitions.
3. Run the Aster Loader Tool with the -a or --auto-partition flag.
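One way to avoid the overlap described in the notice above is to make every range half-open: inclusive at the lower bound and exclusive at the upper bound, so that a boundary value such as '2005-08-01' belongs to exactly one child table. The sketch below is a small shell helper (hypothetical, not part of the Aster tools) that prints CHECK clauses in that form for pasting into the child table definitions.

```shell
# Print a half-open CHECK clause: lower bound inclusive, upper bound exclusive.
half_open_check() {
  # $1 = column name, $2 = inclusive lower bound, $3 = exclusive upper bound
  printf "CHECK ( %s >= '%s' AND %s < '%s' );\n" "$1" "$2" "$1" "$3"
}

half_open_check ymdh 2005-07-01 2005-08-01
half_open_check ymdh 2005-08-01 2005-09-01
# prints:
# CHECK ( ymdh >= '2005-07-01' AND ymdh < '2005-08-01' );
# CHECK ( ymdh >= '2005-08-01' AND ymdh < '2005-09-01' );
```

With these two constraints, the value '2005-08-01' satisfies only the second one.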
Detecting UNIQUE and PRIMARY KEY Violations Before Loading
Detecting UNIQUE and PRIMARY KEY violations in the data to be loaded is not always straightforward. In many cases the source is not a database you can easily run a query on to detect non-unique keys. Some techniques you can use to detect these conditions in your source data:
• Build a version of the target table in the target database without a UNIQUE or PRIMARY KEY constraint, load the data, then run a “detect duplicates” query to find the problematic rows/keys. In some cases, loading only a sample of the data provides enough clues to find and fix the problem in the source data.
• Alternatively (using an ETL tool), use this “keyless” version of the target table as a staging/temp table: load the data into it, check for issues such as duplicate keys, and dump any offending rows to a second error table. If no issues are found, transfer the data to the final destination table.
• If the source is a database, then run the “detect duplicates” query there.
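When the source is a flat file rather than a database, a similar duplicate check can be run on the file itself before loading. The sketch below assumes a single-column key and a single-character field delimiter; it prints each key value that occurs more than once, followed by a tab and its count. The function name and argument order are illustrative.

```shell
# Hypothetical pre-load check: list key values that appear more than once.
find_duplicate_keys() {
  # $1 = field delimiter, $2 = 1-based key column, $3 = data file
  awk -F"$1" -v col="$2" '
    { count[$col]++ }
    END { for (k in count) if (count[k] > 1) print k "\t" count[k] }
  ' "$3" | sort
}

# Usage, for a tab-delimited file keyed on its first column:
#   find_duplicate_keys "$(printf '\t')" 1 sampleData-3.tsv
```

Empty output means no duplicate keys were found in that column.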
Using COPY with Columnar Tables
A loading operation using the Aster Loader Tool, COPY, or INSERT can be expensive when the following conditions exist:
• the target table uses columnar storage, AND
• the target table has many logical partitions, AND
• the loaded data matches many different logical partitions.
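To gauge whether a given input file is likely to meet the third condition, you can count how many distinct values of the partitioning column it contains. The sketch below assumes, purely for illustration, that each distinct value of that column (for example, a date) corresponds to one logical partition; with range partitions the real number of partitions touched may be smaller. The function name is hypothetical.

```shell
# Hypothetical estimate: count distinct values in the partitioning column.
count_partition_values() {
  # $1 = field delimiter, $2 = 1-based partition column, $3 = data file
  awk -F"$1" -v col="$2" '!seen[$col]++ { n++ } END { print n + 0 }' "$3"
}

# Usage (file name and column position are placeholders):
#   count_partition_values '~' 2 2010MarchSales_data.txt
```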
Table of contents
- 10 Conventions Used in This Guide
- 10 Typefaces
- 10 SQL Text Conventions
- 11 Command Shell Text Conventions
- 11 Contact Teradata Global Technical Support (GTS)
- 11 About Teradata Aster
- 11 About This Document
- 12 Version History
- 14 Aster Client Support Matrices
- 14 Aster Client Platform and OS Support Matrix
- 17 Aster ODBC Driver Support Matrix
- 19 Obtaining Aster Client Packages
- 19 Installing the Aster Database Cluster Terminal (ACT)
- 20 Installing ACT on Windows
- 20 Installing ACT on Linux
- 21 Installing ACT on Mac OS
- 21 Installing ACT on Solaris
- 21 Installing ACT on AIX
- 22 Configuring ACT for the Aster File Store
- 23 Installing and Configuring ODBC
- 23 ODBC Driver Manager Compatibility
- 23 Optional ODBC Setting for varchar Data
- 24 Install ODBC on Windows
- 27 Install ODBC on Linux, Solaris, or Mac OS
- 29 Install ODBC on AIX
- 29 Configure DataDirect Driver Manager on Linux and AIX
- 31 Install ODBC on Solaris
- 31 Configure DataDirect Driver Manager on Solaris
- 38 Installing the .NET Data Provider for Aster
- 38 Prerequisites
- 38 Procedure
- 40 Installing the Loader Tool
- 41 Installing the Export Tool
- 41 Installing Teradata Wallet
- 41 Download Teradata Wallet
- 41 Install and Configure Teradata Wallet on Linux
- 43 Install and Configure Teradata Wallet on Windows
- 45 Teradata Wallet
- 46 Wallet Contents
- 46 Teradata Wallet Commands
- 47 Usage
- 48 Authentication Cascading
- 48 Prerequisites
- 48 Authentication Cascading Continuity
- 49 Using Single Sign-on (SSO)
- 49 Configuring Single Sign-on (SSO) with SSL on the Queen
- 49 Configuring the Registry Key for JDBC on Windows
- 49 ODBC with SSO Client-Side Settings
- 50 Adding AD-Based SSO Authentication to ODBC Connections with SSL
- 50 Using SSO with ACT
- 51 SSL Security Basics
- 51 SSL Port Number
- 51 SSL-Related Files and Settings
- 51 SSL Settings on the Queen Reference
- 52 Setting Configuration Parameters on the Queen
- 53 Creating Certificates
- 54 SSL Basics for ODBC
- 54 Setting SSL Parameters for the ODBC Client
- 59 SSL Security Scenarios
- 60 Scenario 1: Queen Provides a Self-Signed Certificate
- 62 Scenario 2: Client Must Have a CA-Signed Copy of the Server’s Certificate
- 64 Scenario 3: Client CA-signed Certificate Must Match the Queen Certificate
- 67 Scenario 4: Encrypting Communication from the Queen to the Client
- 69 Scenario 5: Client has a Copy of the Certificate You Provide
- 73 ACT Quick Start
- 74 Launching ACT
- 74 Launching ACT on Windows
- 75 Launching ACT on Linux, Solaris or AIX
- 75 Launching ACT on Mac
- 75 Launching ACT Directly on the Queen
- 75 Logging In to ACT
- 76 Startup Parameters for ACT
- 79 Using the "on-error-stop" Option in ACT
- 80 Using a Configuration File to Pass ACT Startup Parameters
- 82 Using ACT
- 82 Issuing SQL Queries
- 84 Exit ACT
- 84 Page Through Query Results
- 84 Throttle Query Results in ACT and Aster Database
- 87 ACT Utility Commands
- 87 Repeat Previously Typed Commands
- 87 Tab Completion
- 88 ACT Commands (at the SQL Prompt)
- 94 Aster File Store Commands
- 94 Specifying a URI or Path
- 95 AFS Command Reference
- 98 Java Properties for AFS Clients
- 102 Setting Database Parameters
- 103 Troubleshooting ACT
- 103 ACT Connection Hangs When Using SSL
- 103 Invalid User Name Error in ACT After Password Change
- 103 Misleading Error Message Reports Problem With a Role Instead of With a User
- 105 General Tips for Connecting Clients to Aster Database
- 105 Recommended Character Set Is UTF-8
- 106 Supported Encoding
- 106 When Querying System Tables with ODBC, Set AUTOCOMMIT to 'OFF'
- 106 ODBC Driver
- 106 Using an ODBC Configuration File or Connection String
- 106 Enable Authentication Cascading
- 107 ODBC Usage Notes
- 108 Set Up ODBC for Perl Connectivity on Linux
- 109 Set up ODBC for PHP
- 110 JDBC Driver
- 110 Aster JDBC Driver
- 111 Differences from the Legacy JDBC Driver
- 111 Before You Start
- 111 Install the JDBC Driver
- 112 Use the JDBC Driver in a Java Application
- 113 Parameters for Connecting through JDBC
- 114 Enable Authentication Cascading
- 114 Configuring the JDBC Log Settings
- 115 Behavior and Performance Settings for JDBC
- 119 Cancel
- 121 Supported SQL Commands
- 124 Using Client-Side Cursors in JDBC
- 126 Test JDBC Connect Program
- 128 Configure Aster Database SQL Settings
- 128 SQL Behavior Parameters
- 128 Setting the SQL Behavior Parameters
- 129 Syntax for ODBC Commands
- 130 Process SQL Statements in JDBC
- 130 Process a Simple Query in JDBC
- 131 JDBC Troubleshooting and Limitations
- 132 Connect Reporting Tools to Aster Database
- 132 Connect Aqua Data Studio to Aster Database
- 133 Connect MicroStrategy to Aster Database
- 135 Loading Data with the SSIS .NET Data Provider for Aster
- 135 Overview
- 135 Procedure
- 147 Using the Teradata Aster Connector for SSIS
- 147 Teradata Aster SSIS Connector Features
- 147 Connection Managers
- 148 Data Type Mapping
- 152 Installing the Teradata Aster SSIS Connector
- 153 Creating an Integration Services Project
- 156 Using SSIS Connector
- 172 Internationalization and Locale support
- 173 Example: Using Aster Export Source and Aster Loader Destination
- 178 Working with SSIS Connector Solution Packages
- 180 Possible Exceptions and Resolutions for .NET
- 181 Best Practices for Data Loading
- 181 Loading Terminology
- 182 Scenario 1: Pre-Production Data Loading
- 183 Scenario 2: Loading in a Production Environment
- 186 Loading Best Practices Summary
- 186 Aster Loader Tool
- 186 Syntax
- 188 Argument Flags
- 193 Exit Status
- 194 Loading Data with the Loader Tool
- 195 Removing Nulls from Data with Aster Loader Tool on Linux
- 196 Removing NULLs from Data
- 196 Loading from Multiple Files Using a Map File
- 200 Examples
- 201 Hints for Successful Loading
- 204 Error Logging
- 205 Troubleshoot Loading
- 205 Running Multiple Loaders Concurrently
- 205 Load Stalls Upon Cancellation or Failure Encountered During a Load
- 206 Load Fails on UNIQUE or PRIMARY KEY Violation
- 206 Invalid Input Syntax Error
- 206 Single Quote Character Must be Escaped When Using the -q Option
- 206 Using the -C Option With Uppercase or Special Characters
- 207 Uppercase Characters are Passed as Lowercase if not Escape Quoted
- 207 Issues Using Escape Characters
- 209 Aster Export Tool
- 209 Synopsis
- 211 Argument Flags