Cloudera Manager Administration Guide

Important Notice
(c) 2010-2015 Cloudera, Inc. All rights reserved.
Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service
names or slogans contained in this document are trademarks of Cloudera and its
suppliers or licensors, and may not be copied, imitated or used, in whole or in part,
without the prior written permission of Cloudera or the applicable trademark holder.
Hadoop and the Hadoop elephant logo are trademarks of the Apache Software
Foundation. All other trademarks, registered trademarks, product names and
company names or logos mentioned in this document are the property of their
respective owners. Reference to any products, services, processes or other
information, by trade name, trademark, manufacturer, supplier or otherwise does
not constitute or imply endorsement, sponsorship or recommendation thereof by
us.
Complying with all applicable copyright laws is the responsibility of the user. Without
limiting the rights under copyright, no part of this document may be reproduced,
stored in or introduced into a retrieval system, or transmitted in any form or by any
means (electronic, mechanical, photocopying, recording, or otherwise), or for any
purpose, without the express written permission of Cloudera.
Cloudera may have patents, patent applications, trademarks, copyrights, or other
intellectual property rights covering subject matter in this document. Except as
expressly provided in any written license agreement from Cloudera, the furnishing
of this document does not give you any license to these patents, trademarks,
copyrights, or other intellectual property. For information about patents covering
Cloudera products, see http://tiny.cloudera.com/patents.
The information in this document is subject to change without notice. Cloudera
shall not be liable for any damages resulting from technical errors or omissions
which may be present in this document, or from use of this document.
Cloudera, Inc.
1001 Page Mill Road Bldg 2
Palo Alto, CA 94304
info@cloudera.com
US: 1-888-789-1488
Intl: 1-650-362-0488
www.cloudera.com
Release Information
Version: 5.0.x
Date: September 8, 2015
Table of Contents
About this Guide
Managing the Cloudera Manager Server and Agents
  Starting, Stopping, and Restarting the Cloudera Manager Server
  Configuring Cloudera Manager Server Ports
  Moving the Cloudera Manager Server to a New Host
  Starting, Stopping, and Restarting Cloudera Manager Agents
  Configuring Cloudera Manager Agents
  Viewing Cloudera Manager Server and Agent Logs
  Changing Hostnames
  Backing up Databases
    Backing Up PostgreSQL Databases
    Backing Up MySQL Databases
    Backing Up Oracle Databases
Cloudera Management Service
Managing Users and Authentication
  Cloudera Manager User Accounts
    Changing the Local Logged-In User Password
    Adding a Local User Account
    Changing a User Account Role and Password
    Deleting Local User Accounts
  Configuring External Authentication
    Configuring Authentication Using Active Directory
    Configuring Authentication Using an OpenLDAP-compatible Server
    Configuring Authentication Using an External Program
  Configuring Authentication Using SAML
    Preparing Files
    Configuring Cloudera Manager
    Configuring the IDP
    Verifying Authentication and Authorization
Configuring TLS Security for Cloudera Manager
  Configuring TLS Encryption only for Cloudera Manager
    Step 1: Create a Cloudera Manager Server certificate.
    Step 2: Enable TLS encryption and specify Server keystore properties
    Step 3: Enable and configure TLS on the Agent hosts
    Step 4: Restart the Cloudera Manager Server
    Step 5: Restart the Cloudera Manager Agents
    Step 6: Verify that the Server and Agents are communicating
  Configuring TLS Authentication of Server to Agents
    Step 1: Configure TLS encryption
    Step 2: Provide the Server's server certificate and CA certificate
    Step 3: Copy the Server's server .pem file to the Agents
    Step 4: Enable TLS Encryption in Cloudera Manager
    Step 5: Restart the Cloudera Manager Server
    Step 6: Restart the Cloudera Manager Agents
    Step 7: Verify that the Server and Agents are communicating
  Configuring TLS Authentication of Agents to Server
    Step 1: Configure TLS encryption
    Step 2: Configure TLS Authentication of Server to Agents
    Step 3. Generate the private key for the Agent using openssl
    Step 4: Generate a certificate for the agent
    Step 5: Create a file that contains the password for the key
    Step 6: Configure the Agent with its private key and certificate
    Step 7: Import the Agent's certificate into the Server's truststore
    Step 8: Repeat steps 3 through 7 for every agent in your cluster
    Step 9: Enable Agent authentication and configure the Server to use the new truststore
    Step 10: Restart the Server
    Step 11: Restart the Cloudera Manager Agents
    Step 12: Verify that the Server and Agents are communicating
  Configuring TLS Encryption for Cloudera Manager Admin Console
    Step 1: Create a Cloudera Manager Server certificate.
    Step 2: Enable TLS encryption and specify Server keystore properties
    Step 3: Restart the Cloudera Manager Server
    Step 4: Restart the Cloudera Management Services
    Step 5: Verify that the Server and browser are using TLS to communicate
Upgrading Cloudera Manager
  Database Considerations for Cloudera Manager Upgrades
    Back up Databases
    Modify Databases to Support UTF-8
    Modify Databases to Support Appropriate Maximum Connections
    Next Steps
  Upgrading Cloudera Manager 5 to the Latest Cloudera Manager
    Review Warning
    Perform Prerequisite Steps
    Stop Selected Services
    Stop Cloudera Manager Server, Database, and Agent
    (Optional) Upgrade JDK on Cloudera Manager Server Host and Agent Hosts
    Upgrade Cloudera Manager Server Packages
    Start the Cloudera Manager Server
    Upgrade Cloudera Manager Agent Packages
    Verify the Upgrade Succeeded
    Start Selected Services
    Deploy Updated Client Configurations
    Test the Installation
    (Optional) Upgrade CDH
  Upgrading Cloudera Manager 4 to Cloudera Manager 5
    Review Warnings and Notes
    Perform Prerequisite Steps
    Stop Selected Services
    Stop Cloudera Manager Server, Database, and Agent
    (Optional) Upgrade JDK on Cloudera Manager Server Host and Agent Hosts
    Upgrade Cloudera Manager Server Packages
    Start the Cloudera Manager Server
    Upgrade Cloudera Manager Agent Packages
    Verify the Upgrade Succeeded
    Add Hive Gateway Roles
    Configure Cluster Version for Package Installs
    Upgrade Impala
    (Optional) Hard Restart Cloudera Manager Agents
    (Optional) Restart All Services
    Restart Roles of Audited Services
    Start Selected Services
    Deploy Updated Client Configurations
    Test the Installation
    (Optional) Upgrade CDH
  Upgrading Cloudera Manager 3.7.x
  Re-Running the Cloudera Manager Upgrade Wizard
  Reverting a Failed Cloudera Manager Upgrade
    Reinstall the Cloudera Manager Server Packages
    Start the Server
Other Cloudera Manager Settings
  Administration Settings
  User Interface Language Settings
  Managing Licenses
  Managing Alerts
    Configuring Alert Email Delivery
    Configuring Alert SNMP Delivery
  Kerberos
  Sending Usage and Diagnostic Data to Cloudera
    Managing Anonymous Usage Data Collection
    Managing Hue Analytics Data Collection
    Diagnostic Data Collection
  Importing Cloudera Manager Settings
    Backing up your Current Deployment
    Building a Cloudera Manager Deployment
    Uploading a Cloudera Manager 4.x Configuration Script
About this Guide
This guide is for system administrators who need to manage a Cloudera Manager server installation. This guide
covers managing the Cloudera Manager Server and Agents, managing the Cloudera Management Service, adding
and managing Cloudera Manager users, configuring TLS security, upgrading Cloudera Manager, adding or upgrading
licenses, configuring the Alert Publisher, and other similar features.
Managing the Cloudera Manager Server and Agents
This section describes how to manage the Cloudera Manager Server and the Agents that run on each host
of the cluster.
Starting, Stopping, and Restarting the Cloudera Manager Server
To start the Cloudera Manager Server:
$ sudo service cloudera-scm-server start
You can stop (for example, to perform maintenance on its host) or restart the Cloudera Manager Server without
affecting the other services running on your cluster. Statistics data used by activity monitoring and service
monitoring will continue to be collected during the time the server is down.
To stop the Cloudera Manager Server:
$ sudo service cloudera-scm-server stop
To restart the Cloudera Manager Server:
$ sudo service cloudera-scm-server restart
Configuring Cloudera Manager Server Ports
1. From the Administration tab, select Settings.
2. Under the Ports and Addresses category, set the following options as described below:
Setting | Description
HTTP Port for Admin Console | Specify the HTTP port to use to access the Server via the Admin Console.
HTTPS Port for Admin Console | Specify the HTTPS port to use to access the Server via the Admin Console.
Agent Port to connect to Server | Specify the port for Agents to use to connect to the Server.
3. Click Save Changes.
4. Restart the Cloudera Manager Server.
Moving the Cloudera Manager Server to a New Host
You can move the Cloudera Manager Server if the database information is still available for either of the following
reasons:
• The database server is still available.
• A current backup of the Cloudera Manager database is available.
To move Cloudera Manager Server:
1. Record the old host's hostname and IP address. You do not strictly need the old Cloudera Manager Server
hostname and IP address, but reusing them simplifies the process. You could use a new hostname and
IP address, but you would then have to update the configuration of every Agent to use the new information.
Because reusing the old Server hostname and address is easier in most cases, this procedure does not
describe using a new hostname and IP address.
2. Identify a new host on which to install Cloudera Manager. Assign the failed Cloudera Manager Server's
hostname and IP address to the new host.
Note: If the Agents were configured with the server's hostname, you do not need to assign the
old host's IP address to the new host. Simply assigning the hostname will suffice.
3. Install Cloudera Manager on a new host, using the method described under Install the Cloudera Manager
Server Packages. Do not install the other components, such as CDH and databases.
4. If the database server is not available:
a. Install the database packages on the host that will host the restored database. This could be the same
host on which you have just installed Cloudera Manager or a different host. Which package to install
depends on which database was initially installed on your system. If you used the embedded PostgreSQL
database, install the PostgreSQL package as described in Embedded PostgreSQL Database. If you used
an external MySQL, PostgreSQL, or Oracle database, reinstall it following the instructions in
Cloudera Manager and Managed Service Databases.
b. Restore the backed up databases to the new database installations.
5. Update /etc/cloudera-scm-server/db.properties with the necessary information so that the Cloudera
Manager Server connects to the restored database. This information is typically the database name, database
instance name, user name, and password.
6. Start the Cloudera Manager Server.
At this point, Cloudera Manager should resume functioning as it did before the failure. Because you restored
the database from the backup, the server should accept the running state of the Agents, meaning it will not
terminate any running processes.
The process is similar with secure clusters, though files in /etc/cloudera-scm-server must be restored in
addition to the database.
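Step 5 above can be scripted. The following is a minimal sketch, assuming db.properties holds one key=value
pair per line. The helper name set_property and the example host value are hypothetical, and the key name
com.cloudera.cmf.db.host is an assumption that should be checked against the keys already present in your
own file.

```shell
#!/bin/sh
# set_property FILE KEY VALUE (hypothetical helper, not part of Cloudera Manager)
# Replaces the line for KEY in a key=value properties file, or appends it.
set_property() {
    sp_file="$1"
    sp_tmp="$(mktemp)" || return 1
    # Pass key and value through the environment so awk does not
    # interpret backslashes that may appear in passwords.
    SP_KEY="$2" SP_VALUE="$3" awk -F= '
        $1 == ENVIRON["SP_KEY"] {
            print ENVIRON["SP_KEY"] "=" ENVIRON["SP_VALUE"]
            replaced = 1
            next
        }
        { print }
        END { if (!replaced) print ENVIRON["SP_KEY"] "=" ENVIRON["SP_VALUE"] }
    ' "$sp_file" > "$sp_tmp" && cat "$sp_tmp" > "$sp_file"   # cat keeps the file's owner and mode
    sp_rc=$?
    rm -f "$sp_tmp"
    return "$sp_rc"
}

# Example (assumed key name and host value); try it on a copy first:
#   cp /etc/cloudera-scm-server/db.properties /tmp/db.properties
#   set_property /tmp/db.properties com.cloudera.cmf.db.host db01.example.com
```

Writing the result back with cat rather than mv is a deliberate choice here: the file contains a database
password, so its existing ownership and permissions should be left unchanged.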
Starting, Stopping, and Restarting Cloudera Manager Agents
To start stopped Agents, the supervisord process, and all processes managed by the supervisord process,
use one of the following commands:
• Start
$ sudo service cloudera-scm-agent start
• Clean Start
$ sudo service cloudera-scm-agent clean_start
The directory /var/run/cloudera-scm-agent, which contains the on-disk running Agent state, is completely
cleaned out: all files and subdirectories are removed, and then the start command is executed. Some
Agent state remains in /var/lib/cloudera-scm-agent; do not delete it. For further information, see
Server and Client Configuration and Process Management.
To stop or restart Agents while leaving the processes they manage running, use one of the following commands:
• Stop
$ sudo service cloudera-scm-agent stop
• Restart
$ sudo service cloudera-scm-agent restart
To stop or restart Agents, the supervisord process, and all processes managed by the supervisord process,
use one of the following commands:
• Hard Stop
$ sudo service cloudera-scm-agent hard_stop
• Hard Restart
$ sudo service cloudera-scm-agent hard_restart
Hard restart is useful for the following situations:
1. You are upgrading Cloudera Manager and the supervisord code has changed between your current version
and the new one. To complete this upgrade properly, you must restart supervisord as well.
2. supervisord is hung and needs to be restarted.
3. You want to clear out all running state pertaining to Cloudera Manager and managed services.
• Clean Restart
$ sudo service cloudera-scm-agent clean_restart
Runs hard_stop followed by clean_start.
Warning: Running hard_stop, clean_restart, or hard_restart kills all running managed services
on the host(s) where the command is run.
To check the status of the Agent process, use the command:
$ sudo service cloudera-scm-agent status
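Because some of these actions kill managed services and others do not, a small guard can help avoid running
a destructive action by accident. The following is a sketch only: the function names agent_action_effect
and agent_service and the FORCE variable are hypothetical, and the classification simply restates the
descriptions in this section.

```shell
#!/bin/sh
# Classify cloudera-scm-agent actions by their effect on managed processes,
# following the descriptions in this section.
agent_action_effect() {
    case "$1" in
        start|clean_start)                    echo "starts-managed-processes" ;;
        stop|restart)                         echo "keeps-managed-processes" ;;
        hard_stop|hard_restart|clean_restart) echo "kills-managed-processes" ;;
        status)                               echo "read-only" ;;
        *)                                    echo "unknown"; return 1 ;;
    esac
}

# Run an Agent action, refusing destructive ones unless FORCE=1 is set.
agent_service() {
    aa_effect="$(agent_action_effect "$1")" || {
        echo "unknown action: $1" >&2
        return 2
    }
    if [ "$aa_effect" = "kills-managed-processes" ] && [ "${FORCE:-0}" != "1" ]; then
        echo "refusing '$1': it kills all managed services on this host (set FORCE=1)" >&2
        return 3
    fi
    sudo service cloudera-scm-agent "$1"
}
```

For example, agent_service restart runs immediately, while agent_service hard_restart is refused until it
is invoked as FORCE=1 agent_service hard_restart.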
Configuring Cloudera Manager Agents
Cloudera Manager Agents can be configured globally using properties you set in the Cloudera Manager Admin
Console and by setting properties in individual Agent configuration files.
Configuring Agent Heartbeat and Health Status Options
You can configure the Cloudera Manager Agent heartbeat interval and timeouts to trigger changes in Agent
health as follows:
1. Select Administration > Settings.
2. Under the Performance category, set the following option:
Property | Description
Send Agent Heartbeat Every | The interval in seconds between each heartbeat that is sent from Cloudera Manager Agents to the Cloudera Manager Server. Default: 15 sec.
3. Under the Monitoring category, set the following options:
Property | Description
Set health status to Concerning if the Agent heartbeats fail | The number of missed consecutive heartbeats after which a Concerning health status is assigned to that Agent. Default: 5.
Set health status to Bad if the Agent heartbeats fail | The number of missed consecutive heartbeats after which a Bad health status is assigned to that Agent. Default: 10.
4. Click Save Changes.
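To see how the interval and the thresholds combine, the following sketch multiplies the heartbeat interval
by each missed-heartbeat threshold, using the default values listed above. It assumes the status changes as
soon as the threshold count is reached, so the results are approximate; the function name
seconds_until_status is hypothetical.

```shell
#!/bin/sh
# Approximate seconds without a heartbeat before an Agent's health status
# changes: heartbeat interval multiplied by the missed-heartbeat threshold.
seconds_until_status() {
    echo $(( $1 * $2 ))
}

interval=15      # Send Agent Heartbeat Every (default, in seconds)
concerning=5     # missed heartbeats before Concerning (default)
bad=10           # missed heartbeats before Bad (default)

echo "Concerning after about $(seconds_until_status "$interval" "$concerning") seconds"
echo "Bad after about $(seconds_until_status "$interval" "$bad") seconds"
```

With the defaults, this prints 75 seconds for Concerning and 150 seconds for Bad.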
Configuring the Host Parcel Directory
To configure the location of distributed parcels:
1. Click Hosts in the top navigation bar.
2. Select Configuration > View and Edit.
3. Configure the value of the Parcel Directory property. The setting of the parcel_dir property in the Cloudera
Manager Agent configuration file overrides this setting.
4. Click Save Changes.
Agent Configuration File
The Cloudera Manager Agent supports different types of configuration options in the
/etc/cloudera-scm-agent/config.ini file. You must update the configuration on each host.
Property | Description
server_host, server_port, listening_port, listening_hostname, listening_ip | Hostname and ports of the Cloudera Manager Server and Agent and IP address of the Agent. Also see Configuring Cloudera Manager Server Ports and Ports Used by Cloudera Manager. The Cloudera Manager Agent configures its hostname automatically. However, if your cluster hosts are multi-homed (that is, they have more than one hostname), and you want to specify which hostname the Cloudera Manager Agent uses, you can update the listening_hostname property. If you want to specify which IP address the Cloudera Manager Agent uses, you can update the listening_ip property in the same file. To have a CNAME used throughout instead of the regular hostname, an Agent can be configured to use listening_hostname=CNAME. In this case, the CNAME should resolve to the same IP address as the IP address of the hostname on that machine. Users doing this will find that the host inspector will report problems, but the CNAME will be used in all configurations where that's appropriate. This practice is particularly useful for users who would like clients to use namenode.mycluster.company.com instead of machine1234.mycluster.company.com. In this case, namenode.mycluster would be a CNAME for machine1234.mycluster, and the generated client configurations (and internal configurations as well) would use the CNAME.
log_file | The path to the Agent log file. If the Agent is being started via the init.d script, /var/log/cloudera-scm-agent/cloudera-scm-agent.out will also have a small amount of output (from before logging is initialized). Default: /var/log/cloudera-scm-agent/cloudera-scm-agent.log.
lib_dir | Directory to store Cloudera Manager Agent state that persists across instances of the agent process and system reboots. The agent's UUID is stored here. Default: /var/lib/cloudera-scm-agent.
parcel_dir | Directory to store unpacked parcels. Default: /opt/cloudera/parcels.
max_collection_wait_seconds | Maximum time to wait for all metric collectors to finish collecting data. Default: 10 sec.
metrics_url_timeout_seconds | Maximum time to wait when connecting to a local role's web server to fetch metrics. Default: 30 sec.
task_metrics_timeout_seconds | Maximum time to wait when connecting to a local TaskTracker to fetch task attempt data. Default: 5 sec.
use_tls, verify_cert_file, client_key_file, client_keypw_file, client_cert_file | Security-related configuration. See Configuring TLS Authentication of Agents to Server, Configuring TLS Authentication of Server to Agents, Specifying the Cloudera Manager Server Certificate, and Adding a Host to the Cluster.
mgmt_home | Directory to store Cloudera Management Service files. Default: /usr/share/cmf.
cloudera_mysql_connector_jar, cloudera_oracle_connector_jar, cloudera_postgresql_jdbc_jar | Location of JDBC drivers. See Cloudera Manager and Managed Service Databases. Default: MySQL - /usr/share/java/mysql-connector-java.jar; Oracle - /usr/share/java/oracle-connector-java.jar; PostgreSQL - /usr/share/cmf/lib/postgresql-version-build.jdbc4.jar.
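Because the configuration must be updated on each host, it can help to script the change. The following is
a rough sketch, assuming that config.ini stores properties as key=value lines with no spaces around the
equals sign and that unset properties may appear commented out with a leading "#". The helper names ini_get
and ini_set are hypothetical. Note that ini_set appends a missing key at the end of the file, which may not
be the section you intend, so review the result before restarting the Agent.

```shell
#!/bin/sh
# ini_get FILE KEY: print the last uncommented value set for KEY.
ini_get() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# ini_set FILE KEY VALUE: replace the first "KEY=" line (even if it is
# commented out with "#"), or append "KEY=VALUE" if no such line exists.
ini_set() {
    is_file="$1"
    is_tmp="$(mktemp)" || return 1
    IS_KEY="$2" IS_VALUE="$3" awk '
        BEGIN { k = ENVIRON["IS_KEY"]; v = ENVIRON["IS_VALUE"] }
        {
            line = $0
            sub(/^[ \t]*#?[ \t]*/, "", line)   # ignore a leading "#" and spaces
            if (!replaced && index(line, k "=") == 1) {
                print k "=" v
                replaced = 1
                next
            }
            print
        }
        END { if (!replaced) print k "=" v }
    ' "$is_file" > "$is_tmp" && cat "$is_tmp" > "$is_file"
    is_rc=$?
    rm -f "$is_tmp"
    return "$is_rc"
}

# Example, run on a copy of the file first:
#   cp /etc/cloudera-scm-agent/config.ini /tmp/config.ini
#   ini_set /tmp/config.ini listening_hostname namenode.mycluster.company.com
#   ini_get /tmp/config.ini listening_hostname
```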
Viewing Cloudera Manager Server and Agent Logs
To help you troubleshoot problems, you can view the Cloudera Manager Server and Agent logs. You can view
these logs in the Logs page or in specific pages for the logs.
Viewing Cloudera Manager Server and Agent Logs in the Logs Page
1. Select Diagnostics > Logs on the top navigation bar.
2. Click Select Sources to display the log source list.
3. Uncheck the All Sources checkbox.
4. Check the Cloudera Manager checkbox to view both Agent and Server logs, or click to the left of Cloudera
Manager, and check either the Agent or Server checkbox.
5. Click Search.
For more information about the Logs page, see Logs.
Viewing the Cloudera Manager Server Log
1. Select Diagnostics > Server Log on the top navigation bar.
Note: You can also view the Cloudera Manager Server log at
/var/log/cloudera-scm-server/cloudera-scm-server.log on the server host.
Viewing the Cloudera Manager Agent Log
1. Click the Hosts tab.
2. Click the link for the host where you want to see the Agent log.
3. In the Details panel, click the Details link in the Host Agent field.
4. Click the Agent Log link.
Note: You can also view the Cloudera Manager Agent log at
/var/log/cloudera-scm-agent/cloudera-scm-agent.log on the Agent hosts.
Changing Hostnames
Important: The process described here requires Cloudera Manager and cluster downtime.
After you have installed Cloudera Manager and created a cluster, you may need to update the names of the
hosts running the Cloudera Manager Server or cluster services. To update a deployment with new hostnames,
follow these steps:
1. Verify whether SSL/TLS certificates have been issued for any of the services, and create new SSL/TLS
certificates in advance for the services they protect. Review Cloudera Manager and CDH documentation
at Cloudera Documentation.
Tip: Search for SSL and TLS in the documentation.
2. Export the Cloudera Manager configuration using one of the following methods:
• Open a browser and go to the URL http://cm_hostname:7180/api/api_version/cm/deployment.
Save the displayed configuration.
• From a terminal, type:
$ curl -u admin:admin http://cm_hostname:7180/api/api_version/cm/deployment > cme-cm-export.json
If Cloudera Manager SSL is in use, use the HTTPS URL (the default TLS port is 7183) and specify the -k switch:
$ curl -k -u admin:admin https://cm_hostname:7183/api/api_version/cm/deployment > cme-cm-export.json
where cm_hostname is the name of the Cloudera Manager host and api_version is the correct version of the
API for the version of Cloudera Manager you are using. For example,
http://tcdn5-1.ent.cloudera.com:7180/api/v6/cm/deployment.
3. Stop all services on the cluster.
4. Stop the Cloudera Management Service.
5. Stop the Cloudera Manager Server.
6. Stop the Cloudera Manager Agents on the hosts whose hostnames will change.
7. Back up the Cloudera Manager Server database using mysqldump, pg_dump, or another preferred backup
utility. Store the backup in a safe location.
8. Update names and principals:
a. Update the target hosts using standard per-OS/name service methods (/etc/hosts, DNS,
/etc/sysconfig/network, hostname, and so on). Ensure that you remove the old hostname.
b. If you are changing the hostname of the host running Cloudera Manager Server, do the following:
a. Change the hostname per step 8.a.
b. Update the Cloudera Manager hostname in /etc/cloudera-scm-agent/config.ini on all Agents.
c. If the cluster is configured for Kerberos security, do the following:
a. In the Cloudera Manager database, set the merged_keytab value to NULL:
• PostgreSQL
update roles set merged_keytab=NULL;
• MySQL
update ROLES set MERGED_KEYTAB=NULL;
b. Remove old hostname cluster service principals from the KDC database using one of the following:
• Use the delprinc command within kadmin.local interactive shell.
• From the command line:
kadmin.local -q "listprincs" | grep -E "(HTTP|hbase|hdfs|hive|httpfs|hue|impala|mapred|solr|oozie|yarn|zookeeper)[^/]*/[^/]*@" > cluster-princ.txt
Open cluster-princ.txt and remove any non-cluster service principal entries from it. Make
sure that the default krbtgt principal and any other principals that you created, or that
Kerberos created by default, are not in the file, because the following command deletes
every principal listed in it:
for i in `cat cluster-princ.txt`; do yes yes | kadmin.local -q "delprinc $i"; done
c. Within the Cloudera Manager Admin Console recreate all the principals based on the new hostnames:
a. Select Administration > Kerberos.
b. Do one of the following:
• If there are no principals listed, click the Generate Principals button.
• If there are principals listed, click the top checkbox to select all principals and click the Regenerate
button.
9. Start the Cloudera Manager database and Cloudera Manager Server.
10. Start the Cloudera Manager Agents on the newly renamed hosts. The Agents should show a current heartbeat
in Cloudera Manager.
11. If one of the hosts that was renamed has a NameNode configured with High Availability and automatic
failover enabled, reconfigure the ZooKeeper failover controller znodes to reflect the new hostname.
Warning:
• Do not perform this step if you are also running JobTracker in a High Availability configuration,
as clearing the hadoop-ha znode will negatively impact JobTracker HA.
• All other services, and most importantly HDFS, should not be running.
a. Start ZooKeeper services.
Note: Make sure the ZooKeeper Failover Controller role is stopped within the HDFS service;
start only the ZooKeeper Server role instances.
b. On one of the hosts that has a ZooKeeper Server role, log in to the ZooKeeper CLI to delete the nameservice
znode:
• On a package-based installation, zkCli.sh is found at: /usr/lib/zookeeper/bin/zkCli.sh
• On a parcel-based installation, zkCli.sh is found at:
/opt/cloudera/parcels/CDH/lib/zookeeper/bin/zkCli.sh
a. Verify that the HA znode exists: zkCli$ ls /hadoop-ha
b. Delete the old znode: zkCli$ rmr /hadoop-ha/nameservice1
c. In the Cloudera Manager Admin Console, go to the HDFS service.
d. Click the Instances tab.
e. Select Actions > Initialize High Availability State in ZooKeeper....
12. For each of the Cloudera Management Service roles, go to their configuration and update the Database
Hostname property.
13. Start all cluster services.
14. Start the Cloudera Management Service.
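Before deleting principals in step 8.c, it can help to preview which entries the grep filter would select. The following Python sketch applies the same pattern as the grep -E command to a list of principal names. It is illustrative only: the function name filter_cluster_principals and the sample principals are assumptions, not part of Cloudera Manager or Kerberos.

```python
import re

# Same pattern as the grep -E command in step 8.c: a cluster service name,
# optional extra characters, a slash, the host part, then the realm separator.
CLUSTER_PRINCIPAL = re.compile(
    r"(HTTP|hbase|hdfs|hive|httpfs|hue|impala|mapred|solr|oozie|yarn|zookeeper)"
    r"[^/]*/[^/]*@"
)


def filter_cluster_principals(principals):
    """Return the principals the grep filter would keep (hypothetical helper)."""
    # re.search mirrors grep: the pattern may match anywhere in the line.
    return [p for p in principals if CLUSTER_PRINCIPAL.search(p)]
```

For example, given hdfs/host1.example.com@EXAMPLE.COM, krbtgt/EXAMPLE.COM@EXAMPLE.COM, and admin/admin@EXAMPLE.COM, only the hdfs principal is returned. Because the match is unanchored, as with grep, the result should still be reviewed by hand before any principal is deleted.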
Backing up Databases
Cloudera recommends that you periodically back up the databases that Cloudera Manager uses to store
configuration, monitoring, and reporting data and for managed services that require a database:
• Cloudera Manager - Contains all the information about what services you have configured, their role
assignments, all configuration history, commands, users, and running processes. This is a relatively small
database (<100MB), and is the most important to back up. A monitoring database contains monitoring
information about service and host status. In large clusters, this database can grow large.
• Activity Monitor - Contains information about past activities. In large clusters, this database can grow large.
• Reports Manager - Keeps track of disk utilization and processing activities over time. Medium-sized.
• Cloudera Navigator - Contains auditing information. In large clusters, this database can grow large.
• Hive Metastore - Contains Hive metadata. Relatively small.
Backing Up PostgreSQL Databases
The procedure for backing up a PostgreSQL database depends on whether you are using an embedded or external
database.
Backing up Embedded PostgreSQL Databases
After stopping the database, back up the /var/lib/cloudera-scm-server-db directory.
Backing up External PostgreSQL Databases
Use the pg_dump utility:
1. Log in to the host where the Cloudera Manager Server is installed.
2. Run the following command as root:
cat /etc/cloudera-scm-server/db.properties
The db.properties file contains:
# Auto-generated by scm_prepare_database.sh
# Mon Jul 27 22:36:36 PDT 2011
com.cloudera.cmf.db.type=postgresql
com.cloudera.cmf.db.host=localhost:7432
com.cloudera.cmf.db.name=scm
com.cloudera.cmf.db.user=scm
com.cloudera.cmf.db.password=NnYfWIjlbk
3. Run the following command as root using the parameters from the preceding step:
# pg_dump -h localhost -p 7432 -U scm > /tmp/scm_server_db_backup.$(date +%Y%m%d)
4. Enter the password specified for the com.cloudera.cmf.db.password property on the last line of the
db.properties file. If you are using the embedded database, Cloudera Manager generated the password
for you during installation. If you are using an external database, enter the appropriate information for your
database.
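The mapping from db.properties to the pg_dump parameters can be sketched in Python. This is a minimal illustration, assuming the file uses the simple key=value layout shown above; the helper names parse_db_properties and build_pg_dump_args are hypothetical. Unlike the command above, the sketch passes the database name explicitly as the final pg_dump argument, and it leaves the password out so that pg_dump prompts for it.

```python
def parse_db_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def build_pg_dump_args(props):
    """Build a pg_dump argument list from the Cloudera Manager db properties."""
    # The host property may carry a port, as in localhost:7432.
    host, _, port = props["com.cloudera.cmf.db.host"].partition(":")
    args = ["pg_dump", "-h", host]
    if port:
        args += ["-p", port]
    args += ["-U", props["com.cloudera.cmf.db.user"]]
    # Name the database explicitly instead of relying on pg_dump defaults.
    args.append(props["com.cloudera.cmf.db.name"])
    return args
```

The resulting list could be passed to subprocess.run with stdout redirected to the backup file.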
Backing Up MySQL Databases
To back up the MySQL database, run the mysqldump command on the MySQL host, as follows:
$ mysqldump -hhostname -uusername -ppassword database > /tmp/database-backup.sql
For example, to back up the Activity Monitor database amon on the local host as the root user, with the password
amon_password:
$ mysqldump -pamon_password amon > /tmp/amon-backup.sql
To back up the sample Activity Monitor database amon on remote host myhost.example.com as the root user,
with the password amon_password:
$ mysqldump -hmyhost.example.com -uroot -pamon_password amon > /tmp/amon-backup.sql
Backing Up Oracle Databases
For Oracle, work with your database administrator to ensure databases are properly backed up.
Cloudera Management Service
The Cloudera Management Service implements various management features as a set of roles:
• Activity Monitor - collects information about activities run by the MapReduce service
• Host Monitor - collects health and metric information about hosts
• Service Monitor - collects health and metric information about services and activity information from the
YARN and Impala services
• Event Server - aggregates relevant Hadoop events and makes them available for alerting and searching
• Alert Publisher - generates and delivers alerts for certain types of events
• Reports Manager - generates reports that provide a historical view into disk utilization by user, user group,
and directory, processing activities by user and YARN pool, and HBase tables and namespaces.
Cloudera Manager manages each role separately, instead of as part of the Cloudera Manager Server, for scalability
(for example, on large deployments it's useful to put the monitor roles on their own hosts) and isolation.
In addition, for certain editions of the Cloudera Enterprise license, the Cloudera Management Service provides
the Navigator Audit Server and Navigator Metadata Server (beta) roles for Cloudera Navigator.
Displaying the Cloudera Management Service Status
1. Do one of the following:
• Select Clusters > Cloudera Management Service > mgmt.
• On the Status tab of the Home page, in the Cloudera Management Service table, click the mgmt link.
Starting the Cloudera Management Service
1. Do one of the following:
• 1. Select Clusters > Cloudera Management Service > mgmt.
2. Select Actions > Start.
• 1. On the Home page, click to the right of mgmt and select Start.
2. Click Start to confirm. The Command Details window shows the progress of starting the roles.
3. When Command completed with n/n successful subcommands appears, the task is complete. Click Close.
Stopping the Cloudera Management Service
1. Do one of the following:
• 1. Select Clusters > Cloudera Management Service > mgmt.
2. Select Actions > Stop.
• 1. On the Home page, click to the right of mgmt and select Stop.
2. Click Stop to confirm. The Command Details window shows the progress of stopping the roles.
3. When Command completed with n/n successful subcommands appears, the task is complete. Click Close.
Restarting the Cloudera Management Service
1. Do one of the following:
• 1. Select Clusters > Cloudera Management Service > mgmt.
2. Select Actions > Restart.
• 1. On the Home page, click to the right of mgmt and select Restart.
2. Click Restart to confirm. The Command Details window shows the progress of stopping and then starting
the roles.
3. When Command completed with n/n successful subcommands appears, the task is complete. Click Close.
Configuring Management Service Database Limits
Each Cloudera Management Service role maintains a database for retaining the data it monitors. These databases
(as well as the log files maintained by these services) can grow quite large. For example, the Activity Monitor
maintains data at the service level, the activity level (MapReduce jobs and aggregate activities), and at the task
attempt level. Limits on these data sets are configured when you create the management services, but you can
modify these parameters through the Configuration settings in the Cloudera Manager Admin Console. For
example, the Event Server lets you set a total number of events to store, and Activity Monitor gives you "purge"
settings (also in hours) for the data it stores.
There are also settings for the logs that these various services create. You can throttle how big the logs are
allowed to get and how many previous logs to retain.
1. Do one of the following:
• Select Clusters > Cloudera Management Service > mgmt.
• On the Status tab of the Home page, in the Cloudera Management Service table, click the mgmt link.
2. Select Configuration > View and Edit.
3. In the left-hand column, select the Default role group for the role whose configurations you want to modify.
4. Edit the appropriate properties:
• Activity Monitor - the Purge or Expiration period properties are found in the top-level settings for the
role.
• Host and Service Monitor - see Data Storage for Monitoring Data.
• Log Files - log file size settings will be under the Logs category under the role group.
5. Click Save Changes.
Adding and Starting Cloudera Navigator Roles
1. Do one of the following:
• Select Clusters > Cloudera Management Service > mgmt.
• On the Status tab of the Home page, in the Cloudera Management Service table, click the mgmt link.
2. Click the Instances tab and click the Add button. The Customize Role Assignments page displays.
3. Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of
the hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of
hosts to which the HDFS DataNode role is assigned. These assignments are typically acceptable, but you
can reassign services to hosts of your choosing, if desired.
Click a field below a role to display a dialog containing a pageable list of hosts. If you click a field containing
multiple hosts, you can also select All Hosts to assign the role to all hosts or Custom to display the pageable
hosts dialog.
The following shortcuts for specifying host names are supported:
• Range of hostnames (without the domain portion)
Range Definition          Matching Hosts
10.1.1.[1-4]              10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4
host[1-3].company.com     host1.company.com, host2.company.com, host3.company.com
host[07-10].company.com   host07.company.com, host08.company.com, host09.company.com, host10.company.com
• IP addresses
• Rack name
Click the View By Host button for an overview of the role assignment by host ranges.
4. When you are satisfied with the assignments, click Continue. The Database Setup page displays.
5. Configure settings for required databases:
a. Choose the database type:
• Leave the default setting of Use Embedded Database to have Cloudera Manager create and configure
all required databases. Make a note of the auto-generated passwords.
• Select Use Custom Databases to specify external databases. Enter the database host, database type,
database name, username, and password for the databases that you created when you set up databases
for Cloudera Manager.
1. Provide information for the Activity Monitor (only needed when using MapReduce), Reports Manager,
Hive Metastore, and Cloudera Navigator databases. The value you enter as the database
hostname must match the value you entered for the hostname (if any) when you created the
database.
b. Click Test Connection to confirm that Cloudera Manager can communicate with the databases using the
information you have supplied. If the test succeeds in all cases, click Continue; otherwise check and correct
the information you have provided for the databases and then try the test again. (For Hive, if you are
using the embedded database, you will see a message saying the database will be created at a later point
in the installation process.) The Review Changes page displays.
6. Review and accept any configuration changes (typically there are none). Click Accept. This returns you to the
Instances page.
7. Check the checkboxes next to navigator and navigatormetaserver (Flex Edition or Data Hub Edition only).
8. Select Actions for Selected > Start and confirm Start in the pop-up.
9. Click Close.
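The range shortcut described in step 3 can be illustrated with a short Python sketch. It reproduces the expansions in the Range Definition table, including zero padding for ranges such as [07-10]. It is not Cloudera Manager's own parser: the function name expand_host_range is hypothetical, and the sketch assumes at most one bracketed range per pattern.

```python
import re

# One numeric range in square brackets, for example [07-10].
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_host_range(pattern):
    """Expand a single [start-end] range into the matching host names."""
    match = _RANGE.search(pattern)
    if match is None:
        return [pattern]  # no range present: the pattern is a plain host
    start, end = match.group(1), match.group(2)
    # Keep leading zeros when the range start is zero padded.
    width = len(start) if start.startswith("0") else 0
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [
        prefix + str(n).zfill(width) + suffix
        for n in range(int(start), int(end) + 1)
    ]
```

For example, expand_host_range("host[07-10].company.com") returns host07.company.com through host10.company.com, matching the last row of the table.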
Managing Users and Authentication
This chapter covers managing user accounts and configuring external authentication.
• Cloudera Manager User Accounts on page 23
• Configuring External Authentication on page 24
Cloudera Manager User Accounts
Access to Cloudera Manager features is controlled by user accounts. A user account identifies how a user is
authenticated and determines what privileges are granted to the user.
You manage user accounts through the Administration > Users page.
When you are logged in to the Cloudera Manager Admin Console, the username you are logged in as is at the
far right of the top navigation bar; for example, if you are logged in as admin, you will see admin there.
User Authentication
User authentication can be done through a local database, through an external LDAP directory server (Active
Directory or OpenLDAP-compatible), SAML, or through an external authentication program of your own choosing.
User accounts added in the Users page (which are stored in the local database) show Cloudera Manager in the
User Type column. User accounts added from an LDAP directory or other external authentication mechanism
will have External in the User Type column. See Configuring External Authentication on page 24 for information
on configuring Cloudera Manager to use an external LDAP directory, SAML, or other authentication program for
user authentication.
User Roles
A user role determines what Cloudera Manager features are accessible to the user and what actions the user
can perform. A user account can be assigned one of three roles:
• Administrator - Allows the user to add, change, delete, and configure services or administer user accounts.
Also, even if you are using an external authentication mechanism for user authentication, users with
Administrator privileges can log in to Cloudera Manager using their local Cloudera Manager username and
password. (This prevents the system from locking everyone out if the external authentication settings get
misconfigured.)
• Limited Administrator - Allows the user to view service and monitoring information and decommission hosts,
but not to add services or take any other actions that affect the state of the cluster.
• Read-Only - Allows the user to view service and monitoring information, but not to add services or take any
actions that affect the state of the cluster.
Changing the Local Logged-In User Password
1. Right-click the logged-in username at the far right of the top navigation bar and select Change Password.
2. Enter the current password and the new password twice, and then click Submit.
Adding a Local User Account
1. Select Administration > Users.
2. Click the Add User button.
3. Enter a username and password.
4. Optionally, specify the desired role for the new user.
5. Click Submit.
Changing a User Account Role and Password
Changing an Account Role
1. Select Administration > Users.
2. Check the checkbox next to the username.
3. In the Assign Role: field, select a role from the drop-down list.
4. Click the Assign Role: button.
Changing a Password for a Local User Account
1. Select Administration > Users.
2. Click the Change Password button next to a username with User Type Cloudera Manager.
3. Type the new password and repeat it to confirm.
4. Click the Submit button to make the change.
Deleting Local User Accounts
1. Check the checkbox next to one or more usernames with User Type Cloudera Manager.
2. Click the Delete button. (There is no confirmation of the action.)
Configuring External Authentication
Important: This feature is available only with a Cloudera Enterprise license.
For other licenses, the following applies:
• Cloudera Express - the feature is not available.
• Cloudera Enterprise Data Hub Edition Trial - the feature will not be available after you end the
trial or the trial license expires.
To obtain a license for Cloudera Enterprise, please fill in this form or call 866-843-7207. After you
install a Cloudera Enterprise license, the feature will be available.
Cloudera Manager provides several different mechanisms for authenticating users. You can add users in the
Cloudera Manager Admin Console Users page, which adds them to the Cloudera Manager database (the default)
or configure Cloudera Manager to authenticate against an external authentication service. This can be an LDAP
server (Active Directory or an OpenLDAP compatible directory), or you can specify another external service. If
you are using LDAP or an external service you can configure Cloudera Manager so that it can use both methods
of authentication (internal database or external service), and you can determine the order in which it performs
these searches. You can also restrict login access to members of specific groups, and can specify groups whose
members will automatically be given administrator access to Cloudera Manager. Cloudera Manager also supports
using the Security Assertion Markup Language (SAML) to enable single sign-on.
Configuring an External Authentication Service for Authentication
1. Select Administration > Settings.
2. In the left-hand column, select the External Authentication category.
3. Select the order in which Cloudera Manager should attempt its authentication (Authentication Backend
Order). Here you can choose to authenticate users using just one of the methods (using Cloudera Manager's
own database is the default), or you can set it so that if the user cannot be authenticated by the first method,
it will attempt using the second method. If you select External Only, users who are administrators in the
Cloudera Manager database will still be able to log in with their database password. This is to prevent the
system from locking everyone out if the authentication settings get misconfigured — such as with a bad
LDAP URL.
4. Go to the section for the type of authentication you want to configure, and follow the steps to set the properties
appropriately:
• Configuring Authentication Using Active Directory on page 25
• Configuring Authentication Using an OpenLDAP-compatible Server on page 25
• Configuring Authentication Using an External Program on page 26
• Configuring Authentication Using SAML on page 27
Configuring Authentication Using Active Directory
1. For External Authentication Type select Active Directory.
2. Provide the URL of the Active Directory server.
3. Provide the NT domain to authenticate against.
4. Optionally, provide a comma-separated list of LDAP group names in the LDAP User Groups property. If this
list is provided, only users who are members of one or more of the groups in the list will be allowed to log
into Cloudera Manager. If this property is left empty, all authenticated LDAP users will be able to log into
Cloudera Manager. For example, if there is a group called
"CN=ClouderaManagerUsers,OU=Groups,DC=corp,DC=com", add the group name ClouderaManagerUsers to
the LDAP User Groups list to allow members of that group to log in to Cloudera Manager. The group names
are case-sensitive.
5. In the LDAP Administrator Groups property, you can provide a list of groups whose members should be given
administrator access when they log in to Cloudera Manager. (Admin users must also be members of at least
one of the groups specified in the LDAP User Groups property, or they will not be allowed to log in.) If this
property is left empty, no users are granted administrator access automatically at login; administrator
access must be granted manually by another administrator.
6. In the LDAP Limited Administrator Groups property, you can provide a list of groups whose members should
be given limited administrator access when they log in to Cloudera Manager. Users who are members of one
of the configured groups will be granted limited admin access upon logging in.
Configuring Authentication Using an OpenLDAP-compatible Server
For an OpenLDAP-compatible directory, you have several options for searching for users and groups:
• You can specify a single base Distinguished Name (DN) and then provide a "Distinguished Name Pattern" to
use to match a specific user in the LDAP directory.
• Search filter options let you search for a particular user based on somewhat broader search criteria – for
example Cloudera Manager users could be members of different groups or organizational units (OUs), so a
single pattern won't find all those users. Search filter options also let you find all the groups to which a user
belongs, to help determine if that user should have login or admin access.
1. For External Authentication Type select LDAP.
2. Provide the URL of the LDAP server and (optionally) the base Distinguished Name (DN) (the search base) as
part of the URL — for example ldap://ldap-server.corp.com/dc=corp,dc=com.
3. If your server does not allow anonymous binding, provide the user DN and password to be used to bind to
the directory. These are the LDAP Bind User Distinguished Name and LDAP Bind Password properties. By
default, Cloudera Manager assumes anonymous binding.
4. To use a single "Distinguished Name Pattern," provide a pattern in the LDAP Distinguished Name Pattern
property.
Use {0} in the pattern to indicate where the username should go. For example, to search for a distinguished
name where the uid attribute is the username, you might provide a pattern similar to
uid={0},ou=People,dc=corp,dc=com. Cloudera Manager substitutes the name provided at login into this
pattern and performs a search for that specific user. So if a user provides the username "foo" at the Cloudera
Manager login page, Cloudera Manager will search for the DN uid=foo,ou=People,dc=corp,dc=com.
If you provided a base DN along with the URL, the pattern only needs to specify the rest of the DN pattern.
For example, if the URL you provide is ldap://ldap-server.corp.com/dc=corp,dc=com, and the pattern
is uid={0},ou=People, then the search DN will be uid=foo,ou=People,dc=corp,dc=com.
5. You can also search using User and/or Group search filters, using the LDAP User Search Base, LDAP User
Search Filter, LDAP Group Search Base and LDAP Group Search Filter settings. These allow you to combine
a base DN with a search filter to allow a greater range of search targets.
For example, if you want to authenticate users who may be in one of multiple OUs, the search filter mechanism
will allow this. You can specify the User Search Base DN as dc=corp,dc=com and the user search filter as
uid={0}. Then Cloudera Manager will search for the user anywhere in the tree starting from the Base DN.
Suppose you have two OUs—ou=Engineering and ou=Operations—Cloudera Manager will find User "foo" if
it exists in either of these OUs, that is, uid=foo,ou=Engineering,dc=corp,dc=com or
uid=foo,ou=Operations,dc=corp,dc=com.
You can use a user search filter along with a DN pattern, so that the search filter provides a fallback if the
DN pattern search fails.
The Groups filters let you search to determine if a DN or username is a member of a target group. In this
case, the filter you provide can be something like member={0} where {0} will be replaced with the DN of the
user you are authenticating. For a filter requiring the username, {1} may be used, as memberUid={1}. This
will return a list of groups this user belongs to, which will be compared to the lists in the LDAP User Groups,
LDAP Administrator Groups, and LDAP Limited Administrator Groups properties (discussed in the section
about Active Directory).
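The placeholder substitution described in steps 4 and 5 can be sketched in Python. The sketch only shows how {0} and {1} are filled in and how a base DN from the URL is appended. It is not Cloudera Manager's implementation: the function names build_search_dn and fill_group_filter are hypothetical, and no escaping of special LDAP characters is attempted.

```python
def build_search_dn(pattern, username, base_dn=""):
    """Substitute the login name for {0} and append the base DN, if any."""
    dn = pattern.replace("{0}", username)
    # When the LDAP URL already carries a base DN, the pattern holds only
    # the leading part of the DN and the base DN completes it.
    return f"{dn},{base_dn}" if base_dn else dn


def fill_group_filter(group_filter, user_dn, username):
    """Replace {0} with the user's DN and {1} with the username."""
    return group_filter.replace("{0}", user_dn).replace("{1}", username)
```

With the pattern uid={0},ou=People and the base DN dc=corp,dc=com, the username foo yields uid=foo,ou=People,dc=corp,dc=com, the same DN as in the example in step 4.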
Configuring Cloudera Manager to Use LDAPS instead of LDAP
If the LDAP server certificate has been signed by a trusted Certificate Authority (for example, VeriSign or
GeoTrust), steps 1 and 2 below may not be necessary.
1. Copy the CA certificate file (ca.cer, etc.) to the Cloudera Manager Server.
2. Import the CA certificate(s) from the CA certificate file to the local keystore. For example:
/usr/java/latest/bin/keytool -import -alias <nt_domain_name> -keystore /usr/java/latest/jre/lib/security/cacerts -file <path_to_cert>
Note:
• The default password for the cacerts store is changeit.
• The alias can be any name (not just the domain name).
3. Configure the LDAP URL in the Cloudera Manager configuration to use ldaps://<ldap_server> instead of
ldap://<ldap_server>.
Configuring Authentication Using an External Program
You can configure Cloudera Manager to use an external authentication program of your own choosing. Typically,
this may be a custom script that interacts with a custom authentication service. Cloudera Manager will call the
external program with the username as the first command line argument. The password is passed over stdin.
Cloudera Manager assumes the program will return the following exit codes:
• 0 for the successful authentication of a regular user
• 1 for the successful authentication of an admin user
• 2 for the successful authentication of a limited admin user
• a negative value for failure to authenticate.
1. For External Authentication Type select External Program.
2. Provide a path to the external program in the External Authentication Program Path property.
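The calling convention described above (username as the first argument, password on stdin, role reported through the exit code) can be sketched as a small Python program. This is a minimal illustration under stated assumptions: the in-memory USERS table and all names in it are hypothetical stand-ins for a real authentication service, and the sketch assumes the password arrives as a single line on stdin.

```python
import hmac

# Exit codes listed in this section.
EXIT_REGULAR_USER = 0
EXIT_ADMIN = 1
EXIT_LIMITED_ADMIN = 2
EXIT_FAILURE = -1  # a POSIX shell reports this exit status as 255

# Hypothetical user table; a real program would query its own service.
USERS = {
    "alice": ("alice-secret", EXIT_ADMIN),
    "bob": ("bob-secret", EXIT_REGULAR_USER),
    "carol": ("carol-secret", EXIT_LIMITED_ADMIN),
}


def authenticate(username, password):
    """Return the exit code for this username and password."""
    entry = USERS.get(username)
    if entry is None:
        return EXIT_FAILURE
    expected, exit_code = entry
    # Constant-time comparison of the supplied and expected passwords.
    if hmac.compare_digest(password.encode(), expected.encode()):
        return exit_code
    return EXIT_FAILURE


def main(argv, stdin):
    """argv[1] is the username; the password is read from stdin."""
    if len(argv) < 2:
        return EXIT_FAILURE
    password = stdin.readline().rstrip("\r\n")
    return authenticate(argv[1], password)

# As a standalone script, finish with:
#     import sys; sys.exit(main(sys.argv, sys.stdin))
```

On POSIX systems a process exit status is reported in the range 0-255, so a value of -1 is observed as 255. How Cloudera Manager interprets that value should be confirmed against your version before relying on this sketch.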
Configuring Authentication Using SAML
Important: This feature is available only with a Cloudera Enterprise license.
For other licenses, the following applies:
• Cloudera Express - the feature is not available.
• Cloudera Enterprise Data Hub Edition Trial - the feature will not be available after you end the
trial or the trial license expires.
To obtain a license for Cloudera Enterprise, please fill in this form or call 866-843-7207. After you
install a Cloudera Enterprise license, the feature will be available.
Cloudera Manager supports the Security Assertion Markup Language (SAML), an XML-based open standard
data format for exchanging authentication and authorization data between parties, in particular, between an
identity provider (IDP) and a service provider (SP). The SAML specification defines three roles: the principal
(typically a user), the IDP, and the SP. In the use case addressed by SAML, the principal (user agent) requests a
service from the service provider. The service provider requests and obtains an identity assertion from the IDP.
On the basis of this assertion, the SP can make an access control decision—in other words it can decide whether
to perform some service for the connected principal.
The primary SAML use case is called web browser single sign-on (SSO). A user wielding a user agent (usually a
web browser) requests a web resource protected by a SAML SP. The SP, wishing to know the identity of the
requesting user, issues an authentication request to a SAML IDP through the user agent. In the context of this
terminology, Cloudera Manager operates as an SP. This topic discusses the Cloudera Manager part of the
configuration process; it assumes that you are familiar with SAML and SAML configuration in a general sense,
and that you have a functioning IDP already deployed.
Note:
• Cloudera Manager supports both SP- and IDP-initiated SSO.
• The logout action in Cloudera Manager will send a single-logout request to the IDP.
• SAML authentication has been tested with specific configurations of SiteMinder and Shibboleth.
While SAML is a standard, there is a great deal of variability in configuration between different
IDP products, so it is possible that other IDP implementations, or other configurations of SiteMinder
and Shibboleth, may not interoperate with Cloudera Manager.
Setting up Cloudera Manager to use SAML requires the following steps.
Preparing Files
You will need to prepare the following files and information, and provide these to Cloudera Manager:
• A Java keystore containing:
– A private key for Cloudera Manager to use to sign/encrypt SAML messages
– Any public certificates needed to verify the sign/encrypt key used by your IDP
• The SAML metadata XML file from your IDP
• The entity ID that should be used to identify the Cloudera Manager instance
• How the user ID is passed in the SAML authentication response:
– As the NameID
– As an attribute. If so, what identifier is used.
• The method by which the Cloudera Manager role will be established:
– From an attribute in the authentication response:
– What identifier will be used for the attribute
– What values will be passed to indicate each role
– From an external script that will be called for each use:
– The script takes user ID as $1
– The script sets an exit code to reflect assigned role
– 0 = admin
– 1 = regular user
– 2 = limited admin
– -1 = failure
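The external script option can be sketched as a small shell script. This is a minimal sketch under stated assumptions, not Cloudera's implementation: the user names and the file name cm-saml-role.sh are hypothetical, and a real script would more likely query a directory service. Because a process exit status is limited to the range 0-255, the sketch returns 255 where the list above says -1; how Cloudera Manager interprets that value is an assumption to verify in your environment.

```shell
#!/bin/sh
# Sketch of a SAML role assignment script. Cloudera Manager passes the
# user ID as $1 and reads the assigned role from the exit code.
# The user names below are hypothetical placeholders.

role_code_for() {
  case "$1" in
    alice)     echo 0 ;;    # admin
    bob)       echo 2 ;;    # limited admin
    carol|dan) echo 1 ;;    # regular user
    *)         echo 255 ;;  # failure (-1 expressed as an 8-bit exit status)
  esac
}

# Exit with the role code only when run under the hypothetical script
# name, so the function can also be sourced and checked on its own.
if [ "${0##*/}" = "cm-saml-role.sh" ]; then
  exit "$(role_code_for "$1")"
fi
```

Running `sh cm-saml-role.sh alice; echo $?` from the directory that holds the script should print the code assigned to that user, which is a quick way to check the mapping before pointing Cloudera Manager at the script.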
Configuring Cloudera Manager
1. Start the server normally and log in using an Admin account.
2. Select Administration > Settings.
3. In the left-hand column, select the External Authentication category.
4. Set the External Authentication Type property to SAML (the Authentication Backend Order property is ignored
for SAML).
5. Set the Path to SAML IDP Metadata File property to point to the IDP metadata file.
6. Set the Path to SAML Keystore File property to point to the Java keystore prepared earlier.
7. In the SAML Keystore Password property, set the keystore password.
8. In the Alias of SAML Sign/Encrypt Private Key property, set the alias used to identify the private key for
Cloudera Manager to use.
9. In the SAML Sign/Encrypt Private Key Password property, set the private key password.
10. Set the SAML Entity ID property if:
• There is more than one Cloudera Manager instance being used with the same IDP (each instance needs
a different entity ID).
• Entity IDs are assigned by organizational policy.
11. In the Source of user ID in SAML response property, set whether the user ID will be obtained from the NameID
or an attribute.
12. If an attribute will be used, set the attribute name in the SAML attribute identifier for user ID property. The
default value is the normal OID used for user IDs and so may not need to be changed.
13. In the SAML Role assignment mechanism property, set whether the role assignment will be done from an
attribute or an external script.
• If an attribute will be used:
– In the SAML attribute identifier for user role property, set the attribute name if necessary. The default
value is the normal OID used for OrganizationalUnits and so may not need to be changed.
– In the SAML attribute values for roles property, set which attribute values will be used to indicate the
user role.
• If an external script will be used, set the path to that script in the Path to SAML Role assignment script
property. Make sure that the script is executable (an executable binary is fine - it doesn’t need to be a
shell script).
14. Save the changes. Cloudera Manager will run a set of validations that ensure it can find the metadata XML
and the keystore, and that the passwords are correct. If you see a validation error, correct the problem before
proceeding.
15. Restart the Cloudera Manager Server.
Configuring the IDP
After the Cloudera Manager Server is restarted, it will attempt to redirect to the IDP login page instead of showing
the normal CM page. This may or may not succeed, depending on how the IDP is configured. In either case, the
IDP will need to be configured to recognize CM before authentication will actually succeed. The details of this
process are specific to each IDP implementation - refer to your IDP documentation for details.
1. Download Cloudera Manager's SAML metadata XML file from http://hostname:7180/saml/metadata.
2. Inspect the metadata file and ensure that any URLs contained in the file can be resolved by users’ web
browsers. The IDP will redirect web browsers to these URLs at various points in the process. If the browser
cannot resolve them, authentication will fail. If the URLs are incorrect, you can manually fix the XML file or
set the Entity Base URL in the CM configuration to the right value, and then re-download the file.
3. Provide this metadata file to your IDP using whatever mechanism your IDP provides.
4. Ensure that the IDP has access to whatever public certificates are necessary to validate the private key that
was provided to Cloudera Manager earlier.
5. Ensure that the IDP is configured to provide the User ID and Role using the attribute names that Cloudera
Manager was configured to expect, if relevant.
6. Ensure the changes to the IDP configuration have taken effect (a restart may be necessary).
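The URL inspection in step 2 can be partly automated. The following is a minimal sketch, assuming the downloaded metadata carries its endpoint URLs in Location="..." attributes, as SAML metadata normally does; the function names are hypothetical and the sketch only lists URLs and host names, it does not test whether browsers can reach them.

```shell
#!/bin/sh
# Print each distinct URL found in a Location="..." attribute of a
# SAML metadata file, one per line.
list_metadata_urls() {
  grep -o 'Location="[^"]*"' "$1" | sed 's/^Location="//; s/"$//' | sort -u
}

# Print the host part of each URL so it can be checked against the
# names that users' browsers are able to resolve.
list_metadata_hosts() {
  list_metadata_urls "$1" | sed 's|^[a-z]*://||; s|[:/].*$||' | sort -u
}
```

For example, after saving the metadata to a file named metadata.xml, `list_metadata_hosts metadata.xml` prints the host names to compare with what your users' DNS resolves.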
Verifying Authentication and Authorization
1. Return to the Cloudera Manager Admin Console and refresh the login page.
2. Attempt to log in with credentials for a user that is entitled. The authentication should complete and you
should see the Cloudera Manager Admin Console Home page.
3. If authentication fails, you will see an IDP-provided error message. Cloudera Manager is not involved in this
part of the process, and you must ensure the IDP is working correctly to complete the authentication.
4. If authentication succeeds but the user is not authorized to use Cloudera Manager, they will be taken to an
error page by Cloudera Manager that explains the situation. If a user who should be authorized sees this
error, then you will need to verify their role configuration, and ensure that it is being properly communicated
to Cloudera Manager, whether by attribute or external script. The Cloudera Manager log will provide details
on failures to establish a user’s role. If any errors occur during role mapping, Cloudera Manager will assume
the user is unauthorized.
Configuring TLS Security for Cloudera Manager
Important:
• Cloudera strongly recommends that you set up a fully-functional CDH cluster and Cloudera Manager
before you begin configuring it to use TLS.
• If you want to add new hosts after performing the following procedures to enable TLS, you must
disable TLS and then configure TLS for each new host. For more information, see Adding a Host
to the Cluster.
Transport Layer Security (TLS) provides encryption and authentication in the communications between the
Cloudera Manager Server and Agents. Encryption prevents snooping of communications, and authentication
helps prevent malicious Servers or Agents from causing problems in your cluster. Cloudera Manager supports
three levels of TLS security:
• Level 1 (Good) - Encrypted communications between the Server and Agents only; no authentication of Server
and Agents. See Configuring TLS Encryption only for Cloudera Manager on page 31.
• Level 2 (Better) - Encrypted communications and authentication of Server to Agents and users; no
authentication of Agents to Server. See Configuring TLS Authentication of Server to Agents on page 33.
• Level 3 (Best) - Encrypted communications, authentication of Server to Agents, and authentication of Agents
to Server. See Configuring TLS Authentication of Agents to Server on page 36.
To enable TLS encryption for all connections between your Web browser running the Cloudera Manager Admin
Console and the Cloudera Manager Server, see Configuring TLS Encryption for Cloudera Manager Admin Console
on page 38.
Configuring TLS Encryption only for Cloudera Manager
Use keytool to manage the public keys and certificates for the Cloudera Manager Server. Before configuring
TLS security for Cloudera Manager, create a keystore, as described in the keytool documentation.
For example, you might use a command similar to the following:
keytool -genkey -alias jetty -keystore truststore
Step 1: Create a Cloudera Manager Server certificate.
Warning: You must use an Oracle JDK keytool.
1. Use keytool to generate a certificate for the Cloudera Manager Server. For example:
$ keytool -validity 180 -keystore <path-to-keystore> -alias jetty -genkeypair -keyalg RSA
• The -validity option specifies the certificate lifetime in number of days. If no validity value is specified,
the default value is used. The default varies, but is often 90 days.
• The <path-to-keystore> must be the path where you want to save the keystore file, in a location that the
Cloudera Manager Server host can access.
2. When prompted by keytool, create a password for the keystore. Save the password in a safe place.
3. When prompted by keytool, answer the questions about you and your company accurately.
The most important answer is the CN value for the question "What is your first and last name?" The CN must
match the fully-qualified domain name (FQDN) or IP address of the host where the Server is running. For
example, cmf.company.com or 192.168.123.101.
Important: For the CN value, be sure to use a FQDN if possible, or a static IP address that will not
change. Do not specify an IP address that will change periodically. When Agents connect to the server
using TLS, they check whether the key uses the same name as the one they are using to connect to
the server. If the names do not match, Agents do not heartbeat.
Step 2: Enable TLS encryption and specify Server keystore properties.
1. Log into the Cloudera Manager Admin Console.
2. Select Administration > Settings.
3. Click the Security category.
4. Configure the following TLS settings:
Setting                          Description
Use TLS Encryption for Agents    Enable TLS encryption between the Server and Agents.
Path to TLS Keystore File        The full filesystem path to the keystore file.
Keystore Password                The password for the keystore.
5. Click Save Changes to save the settings.
Step 3: Enable and configure TLS on the Agent hosts.
To enable and configure TLS, you must specify values for the TLS properties in the
/etc/cloudera-scm-agent/config.ini configuration file on all Agent hosts.
1. On the Agent host, open the /etc/cloudera-scm-agent/config.ini configuration file.
2. Edit the following property in the /etc/cloudera-scm-agent/config.ini configuration file.
Property    Description
use_tls     Specify 1 to enable TLS on the Agent, or 0 (zero) to disable TLS.
3. Repeat these steps on every Agent host.
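When there are many Agent hosts, the edit can be scripted. The following is a minimal sketch, assuming the property already appears in config.ini as an uncommented key=value line; the function name set_ini_property is hypothetical. The sketch refuses to append a missing key, because an appended line could land in the wrong section of the file.

```shell
#!/bin/sh
# Replace the value of an existing, uncommented key=value line in an
# INI-style file. Values containing "|" or "&" are not supported.
set_ini_property() {
  file=$1; key=$2; value=$3
  if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
    echo "no ${key}= line in ${file}; edit the file by hand" >&2
    return 1
  fi
  tmp=$(mktemp) || return 1
  # Write to a temporary file first, then copy back so the original
  # file keeps its owner and permissions.
  sed -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key}=${value}|" "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Run as a user with write access to the file, for example: `set_ini_property /etc/cloudera-scm-agent/config.ini use_tls 1`.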
Step 4: Restart the Cloudera Manager Server.
Note: Perform this step only if you are using a self-signed server certificate.
Restart the Cloudera Manager Server with the following command to activate the TLS configuration settings.
$ sudo service cloudera-scm-server restart
Step 5: Restart the Cloudera Manager Agents.
On every Agent host, restart the Agent:
$ sudo service cloudera-scm-agent restart
Step 6: Verify that the Server and Agents are communicating.
In the Cloudera Manager Admin Console, open the Hosts page. If the Agents heartbeat successfully, TLS encryption
is working properly.
Configuring TLS Authentication of Server to Agents
This is the second highest level of TLS security and requires that you provide a server certificate for the Server
that is signed through a chain to a trusted root CA. You must also provide the certificate of the Certificate
Authority (CA) that signed the Server's server certificate. If you are not working in a production environment, you
can also use a self-signed server certificate.
Note: If the Server's server certificate or the associated CA certificate is missing or expired, the Agents
do not allow communications with the Server.
Step 1: Configure TLS encryption.
If you have not already done so, you must configure TLS encryption to use this second level of security. For
instructions, see Configuring TLS Encryption only for Cloudera Manager on page 31.
Step 2: Provide the Server's server certificate and CA certificate.
If you want to use a server certificate signed by a Certificate Authority, use keytool to request one from an
existing CA as described in Using a CA-signed server certificate on page 33. Alternatively, to generate your own
self-signed server certificate with keytool, see Using a self-signed server certificate on page 35.
Using a CA-signed server certificate
1. Generate a new RSA key:
Use keytool provided by the Java SDK to create a new keystore containing a keypair for the Cloudera Manager
server. Replace the
$ keytool -validity 180 -keystore cm_keystore.jks -alias jetty -genkeypair -keyalg RSA
Enter keystore password:
Re-enter new password:
What is your first and last name?
[Unknown]: host_1.example.com
What is the name of your organizational unit?
[Unknown]: Support
What is the name of your organization?
[Unknown]: Cloudera
What is the name of your City or Locality?
[Unknown]: London
What is the name of your State or Province?
[Unknown]:
What is the two-letter country code for this unit?
[Unknown]: GB
Is CN=host_1.example.com, OU=Support, O=Cloudera, L=London, ST=Unknown, C=GB
correct?
[no]: yes
Enter key password for <jetty>
(RETURN if same as keystore password):
2. Create a Certificate Signing Request (CSR):
Use the key created in the previous step to create a CSR for the Cloudera Manager server.
$ keytool -certreq -alias jetty -keystore cm_keystore.jks > jetty.csr
Enter keystore password:
3. Request a new server certificate:
To request a certificate from a recognised Certificate Authority (CA), provide the CSR generated in step 2. The
example below uses a private CA created using OpenSSL.
$ openssl ca -config openssl.cnf -out jetty.crt -infiles jetty.csr
Using configuration from openssl.cnf
Enter pass phrase for cakey.pem:
Check that the request matches the signature
Signature ok
<--SNIP-->
Certificate is to be certified until Apr 19 10:49:41 2024 GMT (3650 days)
Sign the certificate? [y/n]:y
1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated
4. Import CA trust certificates into the Cloudera Manager keystore:
Import the root and intermediate CA certificates to the keystore created in step 1 (Generate a new RSA key).
$ keytool -import -keystore cm_keystore.jks -alias int_CA -file intermediate.crt
Enter keystore password:
Owner: CN=COE Intermediate Test CA, OU=Customer Operations, O=Cloudera, ST=London,
C=GB
Issuer: CN=COE Root Test CA, OU=Customer Operations, O=Cloudera, L=Shoreditch,
ST=London, C=GB
Serial number: 1
Valid from: Tue Apr 22 02:02:26 PDT 2014 until: Wed Apr 22 02:02:26 PDT 2015
<--SNIP-->
Trust this certificate? [no]: yes
Certificate was added to keystore
$ keytool -import -keystore cm_keystore.jks -alias root_CA -file root.crt
Enter keystore password:
Owner: CN=COE Root Test CA, OU=Customer Operations, O=Cloudera, L=Shoreditch,
ST=London, C=GB
Issuer: CN=COE Root Test CA, OU=Customer Operations, O=Cloudera, L=Shoreditch,
ST=London, C=GB
Serial number: 928f80538cdfe523
Valid from: Tue Apr 22 01:58:53 PDT 2014 until: Sun Apr 21 01:58:53 PDT 2019
<--SNIP-->
Trust this certificate? [no]: yes
Certificate was added to keystore
5. Import certificate into Cloudera Manager keystore:
Import the signed server certificate supplied by the CA, using the same alias that was used to create the key
pair in step 1, so the key and the certificate are linked together in the keystore. Make sure you see the
message, Certificate reply was installed in keystore.
$ keytool -import -keystore cm_keystore.jks -alias jetty -file jetty.crt
Enter keystore password:
Certificate reply was installed in keystore
6. Create trusted keystore:
Create a trusted keystore with keytool, as in step 1 (Generate a new RSA key), and import the CA intermediate
and root certificates into this new keystore, in this case trusted.jks. Alternatively, you can use the existing
Cloudera Manager keystore, which already contains the CA intermediate and root certificates, as the trust
store.
$ keytool -import -keystore trusted.jks -alias int_CA -file intermediate.crt
Enter keystore password:
Owner: CN=COE Intermediate Test CA, OU=Customer Operations, O=Cloudera, ST=London,
C=GB
Issuer: CN=COE Root Test CA, OU=Customer Operations, O=Cloudera, L=Shoreditch,
ST=London, C=GB
Serial number: 1
Valid from: Tue Apr 22 02:02:26 PDT 2014 until: Wed Apr 22 02:02:26 PDT 2015
<--SNIP-->
Trust this certificate? [no]: yes
Certificate was added to keystore
$ keytool -import -keystore trusted.jks -alias root_CA -file root.crt
Enter keystore password:
Owner: CN=COE Root Test CA, OU=Customer Operations, O=Cloudera, L=Shoreditch,
ST=London, C=GB
Issuer: CN=COE Root Test CA, OU=Customer Operations, O=Cloudera, L=Shoreditch,
ST=London, C=GB
Serial number: 928f80538cdfe523
Valid from: Tue Apr 22 01:58:53 PDT 2014 until: Sun Apr 21 01:58:53 PDT 2019
<--SNIP-->
Trust this certificate? [no]: yes
Certificate was added to keystore
Using a self-signed server certificate
1. Use keytool to generate a public certificate for the Server by typing the following command on the Server
host:
$ keytool -validity 180 -keystore <path-to-keystore> -alias jetty -genkeypair -keyalg RSA
2. When prompted by keytool, create a password for the keystore. Save the password in a safe place.
3. When prompted by keytool, answer the questions about you and your company accurately.
The most important answer is the CN value for the question "What is your first and last name?" The CN must
match the fully-qualified domain name (FQDN) or IP address of the host where the Server is running. For
example, cmf.company.com or 192.168.123.101.
Important: For the CN value, be sure to use a FQDN if possible, or a static IP address that will not
change. Do not specify an IP address that will change periodically. When agents connect to the
server using TLS, they check whether the key uses the same name as the one they are using to
connect to the server. If the names do not match, agents do not heartbeat.
4. On the Server host, run the following command to export the server certificate from your keystore in the
binary DER format:
$ keytool -exportcert -keystore <path-to-keystore> -alias jetty -file server.der
5. Convert the binary DER format to a .pem file that can be used on the Agents by using openssl:
$ openssl x509 -out server.pem -in server.der -inform der
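Because Agents refuse to heartbeat when the certificate name does not match the name they connect to, it can help to confirm the CN of the converted file before copying it out. The following is a minimal sketch; the function names cert_cn and cn_matches are hypothetical, and the sketch compares only the subject CN.

```shell
#!/bin/sh
# Print the subject common name (CN) of a PEM certificate.
cert_cn() {
  openssl x509 -in "$1" -noout -subject -nameopt multiline |
    sed -n 's/^ *commonName *= *//p'
}

# Succeed only if the certificate CN equals the expected host name.
cn_matches() {
  [ "$(cert_cn "$1")" = "$2" ]
}
```

For example, `cn_matches server.pem cmf.company.com && echo "CN ok"` checks the file against the name that the Agents use to reach the Server.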
Step 3: Copy the Server's server .pem file to the Agents.
1. Copy the Server's server .pem file (for example, server.pem) to the Agent host in any directory. If you have
used a CA-signed certificate, copy the CA's root certificate in PEM format to the Agent host. For example, copy
the .pem file to /etc/cmf.
2. On the Agent host, open the /etc/cloudera-scm-agent/config.ini configuration file and edit the following
properties.
Property            Description
verify_cert_file    Enter the path to the Server's server.pem file. For example, /etc/cmf/server.pem.
use_tls             Set this property to 1.
3. Repeat these steps on every Agent host.
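Before restarting an Agent, the two properties can be checked from the command line. The following is a minimal sketch, assuming both properties appear in config.ini as uncommented key=value lines; the function names are hypothetical, and the check confirms only that the file named by verify_cert_file is readable, not that it is the right certificate.

```shell
#!/bin/sh
# Print the value of an uncommented key=value line in an INI-style file.
# If the key appears more than once, the last occurrence wins.
ini_value() {
  sed -n -E "s|^[[:space:]]*$2[[:space:]]*=[[:space:]]*||p" "$1" | tail -n 1
}

# Succeed only if use_tls is 1 and verify_cert_file names a readable file.
check_agent_tls_config() {
  cfg=$1
  cert=$(ini_value "$cfg" verify_cert_file)
  [ "$(ini_value "$cfg" use_tls)" = "1" ] ||
    { echo "use_tls is not 1" >&2; return 1; }
  [ -n "$cert" ] && [ -r "$cert" ] ||
    { echo "verify_cert_file missing or unreadable: $cert" >&2; return 1; }
  echo "ok: $cert"
}
```

On an Agent host this would be called as `check_agent_tls_config /etc/cloudera-scm-agent/config.ini`.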
Step 4: Enable TLS Encryption in Cloudera Manager.
1. Log into the Cloudera Manager Admin Console.
2. Select Administration > Settings.
3. Click the Security category.
4. Configure the following TLS setting:
Setting                          Description
Use TLS Encryption for Agents    Enable TLS encryption between the Server and Agents.
5. Click Save Changes to save the settings.
Step 5: Restart the Cloudera Manager Server.
$ sudo service cloudera-scm-server restart
Step 6: Restart the Cloudera Manager Agents.
On every Agent host, restart the Agent:
$ sudo service cloudera-scm-agent restart
Step 7: Verify that the Server and Agents are communicating.
In the Cloudera Manager Admin Console, open the Hosts page. If the Agents heartbeat successfully, the Server
and Agents are communicating. If not, check the Agent log,
/var/log/cloudera-scm-agent/cloudera-scm-agent.log, which shows errors if the connection fails.
Configuring TLS Authentication of Agents to Server
This is the highest level of TLS security and requires you to use openssl to create private keys and public
certificates for every Agent on your cluster, and import those Agents' certificates into the Server's truststore.
Step 1: Configure TLS encryption.
If you have not already done so, you must configure TLS encryption to use this third level of security. For
instructions, see Configuring TLS Encryption only for Cloudera Manager.
Step 2: Configure TLS Authentication of Server to Agents.
If you have not already done so, you must configure TLS Authentication of Server to Agents. For instructions,
see Configuring TLS Authentication of Server to Agents.
Step 3: Generate the private key for the Agent using openssl.
1. Run the following openssl command on the agent:
$ openssl genrsa -des3 -out agent.key
2. Provide a password for the key file. Note it in a safe place.
Step 4: Generate a certificate for the agent.
1. Run the following openssl command.
$ openssl req -new -x509 -days 365 -key agent.key -out agent.pem
The certificate is output in a .pem file. In the preceding example, the optional -days argument results in a
certificate that is valid for 365 days.
2. Fill in the answers to the questions about the certificate. Note that the CN must match the hostname or IP
address of the Agent host.
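Steps 3 and 4 can be run without interactive prompts, which helps when repeating them for every Agent. The following is a minimal sketch, not the documented procedure: AGENT_KEY_PASS is a hypothetical environment variable that holds the key passphrase, the 2048-bit key size is an assumption (the command above does not specify one), and -subj sets only the CN.

```shell
#!/bin/sh
# Generate a passphrase-protected RSA key (agent.key) and a self-signed
# certificate (agent.pem) in the current directory for one Agent.
# $1 is the Agent host name to use as the certificate CN.
# AGENT_KEY_PASS (hypothetical) must be exported with the passphrase.
make_agent_cert() {
  agent_host=$1
  openssl genrsa -des3 -passout env:AGENT_KEY_PASS -out agent.key 2048 &&
    openssl req -new -x509 -days 365 -key agent.key \
      -passin env:AGENT_KEY_PASS -subj "/CN=${agent_host}" -out agent.pem
}
```

For example, after exporting AGENT_KEY_PASS, `make_agent_cert agent1.example.com` writes both files for a host with that name.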
Step 5: Create a file that contains the password for the key.
The Agent reads the password from a text file instead of from a command line. The file allows you to use file
permissions to protect the password. For example, name the file agent.pw.
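One way to create the password file with restrictive permissions is sketched below. The function name write_password_file is hypothetical, and the sketch writes the passphrase without a trailing newline; whether the Agent tolerates a trailing newline is not stated here, so treat that choice as an assumption.

```shell
#!/bin/sh
# Write a passphrase to a file that only its owner can read or write.
# $1 is the target path (for example agent.pw), $2 is the passphrase.
write_password_file() {
  (
    umask 077                  # a newly created file gets mode 0600
    printf '%s' "$2" > "$1"
  ) && chmod 600 "$1"          # also tighten a file that already existed
}
```

After running it, `ls -l agent.pw` should show `-rw-------`, and the file should be owned by the account that runs the Agent.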
Step 6: Configure the Agent with its private key and certificate.
1. On the Agent host, open the /etc/cloudera-scm-agent/config.ini configuration file.
2. Edit the following properties in the /etc/cloudera-scm-agent/config.ini configuration file.
Property             Description
client_key_file      Name of client key file
client_keypw_file    Name of client key pw file
client_cert_file     Name of client certificate file
3. Repeat these steps on every Agent host.
Step 7: Import the Agent's certificate into the Server's truststore.
The Server's truststore contains the certificates that are required to authenticate clients. Use the following
command to import a certificate called, for example, agent.pem into a new truststore called, for example,
truststore.
$ keytool -keystore <path-to-truststore> -import -alias <agent-name> -file agent.pem
Step 8: Repeat steps 3 through 7 for every agent in your cluster.
Important: Each Agent's private key and certificate that you import into the Server's truststore must
be unique.
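If the Agent certificates have been collected on the Server host, the imports in step 7 can be run in a loop. The following is a minimal sketch, assuming each certificate was saved in one directory as <agent-name>.pem; that layout, the function name, and the KEYTOOL override are all hypothetical. keytool still prompts for the truststore password and for trust confirmation on each import.

```shell
#!/bin/sh
# Import every *.pem file in a directory into one truststore, using the
# file name without .pem as the alias so that each alias is unique.
# KEYTOOL can be overridden, for example with "echo" for a dry run.
KEYTOOL=${KEYTOOL:-keytool}

import_agent_certs() {
  truststore=$1; cert_dir=$2
  for cert in "$cert_dir"/*.pem; do
    [ -e "$cert" ] || continue           # the directory had no .pem files
    alias_name=$(basename "$cert" .pem)
    "$KEYTOOL" -keystore "$truststore" -import -alias "$alias_name" -file "$cert" || return 1
  done
}
```

Setting `KEYTOOL=echo` before calling `import_agent_certs <path-to-truststore> <cert-dir>` prints the commands that would run, which is a cheap way to review the aliases first.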
Step 9: Enable Agent authentication and configure the Server to use the new truststore.
1. Log into the Cloudera Manager Admin Console.
2. Select Administration > Settings.
3. Click the Security category.
4. Configure the following TLS settings:
Setting                                       Description
Use TLS Authentication of Agents to Server    Select this option to enable TLS Authentication of Agents to the Server.
Path to Truststore                            Specify the full filesystem path to the truststore located on the Cloudera Manager Server host.
Truststore Password                           Specify the password for the truststore.
5. Click Save Changes to save the settings.
Step 10: Restart the Server.
$ sudo service cloudera-scm-server restart
Step 11: Restart the Cloudera Manager Agents.
On every Agent host, restart the Agent:
$ sudo service cloudera-scm-agent restart
Step 12: Verify that the Server and Agents are communicating.
In Cloudera Manager Admin Console, open the Hosts page. If the Agents heartbeat successfully, the Server and
Agents are communicating. If they are not, you may get an error in the Server, such as a null CA chain error.
This implies either the truststore doesn't contain the Agent certificate or the Agent isn't presenting the certificate.
Double check all of your settings. Check the Server's log to verify whether TLS and Agent validation have been
enabled correctly.
Configuring TLS Encryption for Cloudera Manager Admin Console
This level of security is for users connecting to the Cloudera Manager Admin Console.
Step 1: Create a Cloudera Manager Server certificate.
Note: If you have already completed this step when configuring TLS encryption for Cloudera Manager,
you do not need to repeat it.
Warning: You must use an Oracle JDK keytool.
1. Use keytool to generate a certificate for the Cloudera Manager Server. For example:
$ keytool -validity 180 -keystore <path-to-keystore> -alias jetty -genkeypair -keyalg RSA
• The -validity option specifies the certificate lifetime in number of days. If no validity value is specified,
the default value is used. The default varies, but is often 90 days.
• The <path-to-keystore> must be the path where you want to save the keystore file, in a location that the
Cloudera Manager Server host can access.
2. When prompted by keytool, create a password for the keystore. Save the password in a safe place.
3. When prompted by keytool, answer the questions about you and your company accurately.
The most important answer is the CN value for the question "What is your first and last name?" The CN must
match the fully-qualified domain name (FQDN) or IP address of the host where the Server is running. For
example, cmf.company.com or 192.168.123.101.
Important: For the CN value, be sure to use a FQDN if possible, or a static IP address that will not
change. Do not specify an IP address that will change periodically. When agents connect to the server
using TLS, they check whether the key uses the same name as the one they are using to connect to
the server. If the names do not match, agents do not heartbeat.
Step 2: Enable TLS encryption and specify Server keystore properties.
1. Log into the Cloudera Manager Admin Console.
2. From the Administration tab select Settings, then go to the Security category.
3. Configure the following three TLS settings:
Setting                                 Description
Use TLS Encryption for Admin Console    Select this option to enable TLS encryption between the Server and the user's web browser.
Path to TLS Keystore File               Specify the full filesystem path to the keystore file.
Keystore Password                       Specify the password for the keystore.
4. Click Save Changes to save the settings.
Step 3: Restart the Cloudera Manager Server.
Restart the Cloudera Manager Server with the following command to activate the TLS configuration settings.
$ sudo service cloudera-scm-server restart
Log out and then log in to Cloudera Manager to test the certificate. You may see a warning message to accept
the certificate if the root certificate is not installed in your browser.
Step 4: Restart the Cloudera Management Services.
Restart the Cloudera Management Services by clicking the Services link and choosing Restart on the Actions
menu for the Cloudera Management Services. Click Restart that appears in the next screen to confirm. When
you see a Finished status, the service has restarted.
Step 5: Verify that the Server and browser are using TLS to communicate.
Open the Cloudera Manager Admin Console page in your browser. Every browser has its own way of indicating
a successful TLS connection. Some browsers indicate this by displaying a lock icon in the URL bar while others
display an error message if the connection is unencrypted.
Upgrading Cloudera Manager
Upgrading Cloudera Manager preserves existing data and settings, while enabling the use of the new features
provided with the latest product versions. To enable new features, some new settings are added, and some
additional steps may be required, but no existing configuration is removed.
Note: When an upgraded Cloudera Manager adds support for a new feature (for example, Sqoop 2,
WebHCat, and so on), it does not install the software on which the new feature depends. If you install
CDH and managed services from packages, you must add the packages to your managed hosts first,
before adding a service or role that supports the new feature.
Understanding Upgrades
The process for upgrading Cloudera Manager varies depending on the starting point. The categories of tasks to
be completed include the following:
• Install any databases required for the release. In Cloudera Manager 5, the Host Monitor and Service Monitor
roles use an internal database that provides greater capacity and flexibility for current and future uses. You
no longer need to configure an external database for this purpose. If you are upgrading from Cloudera Manager
4, this transition is handled automatically. If you are upgrading a Free Edition installation and you are running
a MapReduce service, you are asked to configure an additional database for the Activity Monitor that is part
of Cloudera Express.
• Upgrade the Cloudera Manager Server.
• Upgrade the Cloudera Manager Agent. This can be done via an upgrade wizard that is invoked when you
connect to the Admin Console or by manually installing the Cloudera Manager Agent packages.
Upgrading Cloudera Manager
You can upgrade from any version of Cloudera Manager 4 running CDH 4 to Cloudera Manager 5, or from Cloudera
Manager 5 to a later version of Cloudera Manager 5. See the instructions at:
• Upgrading Cloudera Manager 5 to the Latest Cloudera Manager on page 44. After upgrading Cloudera Manager
5, the following is true:
– Database schema is upgraded to reflect the current version.
– The Cloudera Manager Server and all supporting services are updated.
– Client configurations are redeployed to ensure client services have the most current configuration.
• Upgrading Cloudera Manager 4 to Cloudera Manager 5 on page 50 and Upgrading Cloudera Manager 3.7.x
on page 61. After upgrading to Cloudera Manager 5, the following is true:
– Database schema is upgraded to reflect the current version. Data from the existing Host and Service
Monitor databases is migrated.
– The Cloudera Manager Server and all supporting services are updated.
– Client configurations are redeployed to ensure client services have the most current configuration.
– Cloudera Manager 5 continues to support a CDH 4 cluster with an existing High Availability deployment
using NFS shared edits directories. However, if you disable High Availability in Cloudera Manager 5, you
will only be able to re-enable High Availability using Quorum-based Storage. CDH 5 does not support
enabling NFS shared edits directories with High Availability.
Upgrading CDH
Cloudera Manager 5 can manage both CDH 4 and CDH 5, so upgrading existing CDH 4 installations is not required.
However, to get the benefits of the most current CDH features, you may want to upgrade CDH. See the following
topics for more information on upgrading CDH:
• Upgrading CDH 4 - Follow this path to upgrade existing installations of CDH 4 to the latest version of CDH
4.
• Upgrading to CDH 5 - Follow this path to upgrade existing installations of CDH 4 to CDH 5.
Database Considerations for Cloudera Manager Upgrades
Cloudera Manager uses databases to store information about system configurations and tasks. Before upgrading,
complete the pre-upgrade database tasks that apply in your environment.
Note:
Cloudera Manager 4.5 added support for Hive, which includes the Hive Metastore Server role type.
This role manages the Metastore process when Hive is configured with a remote Metastore.
When upgrading from Cloudera Manager prior to 4.5, Cloudera Manager automatically creates new
Hive service(s) to capture the previous implicit Hive dependency from Hue and Impala. Your previous
services will continue to function without impact. If Hue was using a Hive Metastore backed by a
Derby database, then the newly created Hive Metastore Server will also use Derby. Since Derby does
not allow concurrent connections, Hue will continue to work, but the new Hive Metastore Server will
fail to run. The failure is harmless (because nothing uses this new Hive Metastore Server at this point)
and intentional, to preserve the set of cluster functionality as it was before upgrade. Cloudera
discourages the use of a Derby backed Hive Metastore due to its limitations. You should consider
switching to a different supported database.
After you have completed these steps, the upgrade processes automatically complete any additional updates
to database schema and service data stored. You do not need to complete any data migration.
Back up Databases
Before beginning the upgrade process, shut down the services that are using databases. This includes the
Cloudera Manager Management Service roles, the Hive Metastore server, and Cloudera Navigator, if it is in use.
Cloudera strongly recommends that you then back up all databases; backing up the Activity Monitor
database is optional. Backing up is especially important if you are upgrading from Cloudera Manager 4 to Cloudera
Manager 5. For information on backing up databases, see Backing up Databases on page 16.
If any additional database will be required as a result of the upgrade, complete any required preparatory work
to install and configure those databases. For example, if you are upgrading from Cloudera Manager Free Edition,
Cloudera Manager 5 with Cloudera Express requires a database for the Activity Monitor. The upgrade instructions
assume all required databases have been prepared. For more information on using databases, see Cloudera
Manager and Managed Service Databases.
Modify Databases to Support UTF-8
Cloudera Manager 4.0 adds support for UTF-8 character sets. Update any existing databases in your environment
that are not configured to support UTF-8.
Modifying MySQL to Support UTF-8
To modify a MySQL database to support UTF-8, change the default character set and then restart the mysql
service. Use the following commands to complete these tasks:
mysql> alter database default character set utf8;
mysql> quit
$ sudo service mysql restart
Modifying PostgreSQL to Support UTF-8
There is no single command available to modify an existing PostgreSQL database to support UTF-8. As a result,
you must complete the following process:
1. Use pg_dump to export the database to a file. This creates a backup of the database that you will import into
a new, empty database that supports UTF-8.
2. Drop the existing database. This deletes the existing database.
3. Create a new database that supports Unicode encoding and that has the same name as the old database.
Use a command of the following form, replacing the database name and user name with values that match
your environment:
CREATE DATABASE scm_database WITH OWNER scm_user ENCODING 'UTF8';
4. Review the contents of the exported database for non-standard characters. If you find unexpected characters,
modify these so the database backup file contains the expected data.
5. Import the database backup to the newly created database.
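The export, drop, create, and import steps above can be sketched as a short shell script. This is a minimal sketch, not a Cloudera-supplied procedure. The database name scm_database and owner scm_user are taken from the example command above; the postgres superuser account and the dump file path are assumptions. The function only prints the commands so that you can review and adapt them before running anything, because dropping the database is destructive.

```shell
#!/bin/sh
# Assumed values; replace them to match your environment.
DB_NAME="scm_database"          # database name from the example above
DB_OWNER="scm_user"             # owner from the example above
PG_SUPERUSER="postgres"         # assumption: superuser account name
DUMP_FILE="/tmp/${DB_NAME}.sql" # assumption: location of the export file

# Print the commands for steps 1, 2, 3, and 5 in order. Step 4 (reviewing the
# export for non-standard characters) is manual and happens before the import.
pg_utf8_plan() {
  echo "pg_dump -U ${PG_SUPERUSER} -f ${DUMP_FILE} ${DB_NAME}"
  echo "dropdb -U ${PG_SUPERUSER} ${DB_NAME}"
  echo "createdb -U ${PG_SUPERUSER} -O ${DB_OWNER} -E UTF8 ${DB_NAME}"
  echo "psql -U ${PG_SUPERUSER} -d ${DB_NAME} -f ${DUMP_FILE}"
}

pg_utf8_plan
```

Depending on the locale of the PostgreSQL installation, createdb may also need the -T template0 option before it accepts a different encoding. Confirm that the export file is complete and readable before you run the dropdb step.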
Modifying Oracle to Support UTF-8
Work with your Oracle database administrator to ensure any Oracle databases support UTF-8.
Modify Databases to Support Appropriate Maximum Connections
Check existing database configurations to ensure the proper maximum number of connections is supported.
Update the maximum configuration values, as required.
Modify the Maximum Number of MySQL Connections
Allow 100 maximum connections for each database and then add 50 extra connections. For example, for two
databases set the maximum connections to 250. If you store five databases on one host (the databases for
Cloudera Manager Server, Activity Monitor, Reports Manager, Cloudera Navigator, and Hive Metastore), set the
maximum connections to 550.
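The sizing rule above is simple arithmetic, sketched here as a small shell function. The function name max_connections is chosen for illustration only; the result is the value you would then apply to the MySQL max_connections setting.

```shell
#!/bin/sh
# Rule from the text: 100 connections per database, plus 50 extra.
# $1 is the number of databases stored on the host.
max_connections() {
  echo $(( $1 * 100 + 50 ))
}

max_connections 2   # two databases: prints 250
max_connections 5   # five databases on one host: prints 550
```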
Modify the Maximum Number of PostgreSQL Connections
Update the max_connections parameter in the /etc/postgresql.conf file.
You may have to increase the system resources available to PostgreSQL, as described at
http://www.postgresql.org/docs/9.1/static/kernel-resources.html.
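Before changing the setting, it can help to see the value currently in effect. The following is a minimal sketch that reads the active (uncommented) max_connections line from a PostgreSQL configuration file. The function name pg_max_connections is hypothetical, and the demonstration uses a temporary sample file rather than a real configuration file; pass the path of your own postgresql.conf instead.

```shell
#!/bin/sh
# Print the active (uncommented) max_connections value from the PostgreSQL
# configuration file given as the first argument.
pg_max_connections() {
  awk -F'=' '/^[ \t]*max_connections[ \t]*=/ {
    value = $2
    gsub(/[ \t]|#.*/, "", value)   # strip whitespace and any trailing comment
    print value
  }' "$1"
}

# Self-contained demonstration against a temporary sample file.
sample=$(mktemp)
printf '%s\n' '#max_connections = 10' \
  'max_connections = 100   # (change requires restart)' > "$sample"
pg_max_connections "$sample"
rm -f "$sample"
```

The commented-out line in the sample is ignored, so only the active value is printed.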
Modify the Maximum Number of Oracle Connections
Work with your Oracle database administrator to ensure appropriate values are applied for your Oracle database
settings. You must determine the number of connections, transactions, and sessions to be allowed.
Allow 100 maximum connections for each database and then add 50 extra connections. For example, for two
databases set the maximum connections to 250. If you store five databases on one host (the databases for
Cloudera Manager Server, Activity Monitor, Reports Manager, Cloudera Navigator, and Hive Metastore), set the
maximum connections to 550.
From the maximum number of connections, you can determine the number of anticipated sessions using the
following formula:
sessions = (1.1 * maximum_connections) + 5
For example, if a host has two databases, you anticipate 250 maximum connections. If you anticipate a maximum
of 250 connections, plan for 280 sessions.
Once you know the number of sessions, you can determine the number of anticipated transactions using the
following formula:
transactions = 1.1 * sessions
Continuing with the previous example, if you anticipate 280 sessions, you can plan for 308 transactions.
Work with your Oracle database administrator to apply these derived values to your system.
Using the sample values above, Oracle attributes would be set as follows:
alter system set processes=250;
alter system set transactions=308;
alter system set sessions=280;
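The two formulas above can be checked with a few lines of shell arithmetic. This is a minimal sketch; the function names are hypothetical, and rounding fractional results up to the next whole number is an assumption, since the text does not say how to round.

```shell
#!/bin/sh
# Multiply by 1.1 using integer arithmetic, rounding up (assumption).
times_1_1() {
  echo $(( ($1 * 11 + 9) / 10 ))
}

# sessions = (1.1 * maximum_connections) + 5
oracle_sessions() {
  echo $(( $(times_1_1 "$1") + 5 ))
}

# transactions = 1.1 * sessions
oracle_transactions() {
  times_1_1 "$(oracle_sessions "$1")"
}

connections=250
echo "processes=${connections}"
echo "sessions=$(oracle_sessions "$connections")"
echo "transactions=$(oracle_transactions "$connections")"
```

For 250 connections this prints processes=250, sessions=280, and transactions=308, matching the sample values above.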
Next Steps
After you have completed any required database preparatory tasks, continue to Upgrading Cloudera Manager
4 to Cloudera Manager 5 on page 50.
Upgrading Cloudera Manager 5 to the Latest Cloudera Manager
This process applies to upgrading all versions of Cloudera Manager 5.
In most cases it is possible to complete the following upgrade without shutting down most CDH services, although
you may need to stop some dependent services. CDH daemons can continue running, unaffected, while Cloudera
Manager is upgraded. The upgrade process does not affect your CDH installation. After upgrading Cloudera
Manager you may also want to upgrade CDH 4 clusters to CDH 5.
Upgrading from a version of Cloudera Manager 5 to the latest version of Cloudera Manager involves the following
broad steps.
Review Warning
Warning: If you have enabled auditing with Cloudera Navigator, during the process of upgrading
Cloudera Manager 5 auditing is suspended and is only restarted when you restart the roles of audited
services.
Perform Prerequisite Steps
Ensure that you have performed the following steps:
• Obtain host credentials - You must have SSH access and be able to log in using a root account or an account
that has password-less sudo permission. See Cloudera Manager Requirements for more information.
• Stop running commands - Use the Admin Console to check for any running commands. You can either wait
for commands to complete or abort any running commands. For more information on viewing and aborting
running commands, see Viewing Running and Recent Commands.
• Prepare databases - See Database Considerations for Cloudera Manager Upgrades on page 42.
Stop Selected Services
If your cluster meets any of the conditions listed in the following table, you must stop the indicated services or
roles.
Condition: Running a version of Cloudera Manager that has the Cloudera Management Service
Procedure: Stop the Cloudera Management Service.
Condition: Running Cloudera Navigator
Procedure: Stop any of the following roles whose service's Queue Policy configuration
(navigator.batch.queue_policy) is set to SHUTDOWN:
• HDFS - NameNode
• HBase - Master and RegionServers
• Hive - HiveServer2
• Hue - Beeswax Server
Stopping these roles renders any service depending on these roles unavailable. For the HDFS - NameNode
case this implies most of the services in the cluster will be unavailable until the upgrade is finished.
Stop Cloudera Manager Server, Database, and Agent
1. On the host running the Cloudera Manager Server, stop the Cloudera Manager Server:
$ sudo service cloudera-scm-server stop
2. If you are using the embedded PostgreSQL database for Cloudera Manager, stop the database:
$ sudo service cloudera-scm-server-db stop
Important: If you are not running the embedded database service and you attempt to stop it, you
will get a message to the effect that the service cannot be found. If instead you get a message
that the shutdown failed, this means the embedded database is still running, probably due to
services connected to the Hive Metastore. Do not proceed with the installation until you have
stopped all your Metastore-dependent services and the database successfully shuts down (restart
the Cloudera Manager server to shut down services as necessary). If you continue without solving
this, your upgrade will fail and you will be left with a non-functional Cloudera Manager installation.
3. If the Cloudera Manager host is also running the Cloudera Manager Agent, stop the Cloudera Manager Agent:
$ sudo service cloudera-scm-agent stop
(Optional) Upgrade JDK on Cloudera Manager Server Host and Agent Hosts
If you are manually upgrading the Cloudera Manager Agent packages in Upgrade Cloudera Manager Agent
Packages on page 48, and you plan to upgrade to CDH 5, ensure that JDK1.7u45 is installed on the Agent hosts
following the instructions in Java Development Kit Installation.
If you are not running Cloudera Manager Server on the same host as a Cloudera Manager Agent, and you want
all hosts to run the same JDK version, optionally install JDK1.7u45 on that host.
Upgrade Cloudera Manager Server Packages
1. To upgrade the Cloudera Manager Server Packages, you can upgrade from Cloudera's repository at
http://archive.cloudera.com/cm5/ or you can create your own repository, as described in Understanding
Custom Installation Solutions. Creating your own repository is necessary if you are upgrading a cluster that
does not have access to the Internet.
a. Find the Cloudera repo file for your distribution by starting at http://archive.cloudera.com/cm5/
and navigating to the directory that matches your operating system.
For example, for Red Hat or CentOS 6, you would navigate to
http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/. Within that directory, find the repo file
that contains information including the repository's base URL and GPG key. The contents of the
cloudera-manager.repo file might appear as follows:
[cloudera-manager]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera Manager
baseurl=http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/
gpgkey = http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera
gpgcheck = 1
For Ubuntu or Debian systems, the repo file can be found by navigating to the appropriate release directory,
for example, http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm. The repo file, in this
case, cloudera.list, may appear as follows:
# Packages for Cloudera Manager, Version 5, on Debian 7.0 x86_64
deb http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm wheezy-cm5 contrib
deb-src http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm wheezy-cm5 contrib
b. Replace the repo file in the configuration location for the package management software for your system.
Operating System    Commands
RHEL                Copy cloudera-manager.repo to /etc/yum.repos.d/.
SLES                Copy cloudera-manager.repo to /etc/zypp/repos.d/.
Ubuntu or Debian    Copy cloudera.list to /etc/apt/sources.list.d/.
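The copy step can be scripted when many hosts are involved. The following is a minimal sketch that maps an operating system family to the repo file name and destination directory listed above. The family keys (rhel, sles, ubuntu, debian) and the function name repo_target are hypothetical, and the script prints the copy command for review instead of running it.

```shell
#!/bin/sh
# Map an operating system family to the repo file name and the destination
# directory given in the table above. Prints "<file> <directory>".
repo_target() {
  case "$1" in
    rhel)          echo "cloudera-manager.repo /etc/yum.repos.d/" ;;
    sles)          echo "cloudera-manager.repo /etc/zypp/repos.d/" ;;
    ubuntu|debian) echo "cloudera.list /etc/apt/sources.list.d/" ;;
    *)             echo "unsupported OS family: $1" >&2; return 1 ;;
  esac
}

# Print the copy command for review rather than running it.
set -- $(repo_target rhel)
echo "sudo cp $1 $2"
```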
c. Run the following commands:
Operating System
Commands
RHEL
$ sudo yum clean all
$ sudo yum upgrade 'cloudera-*'
Note:
• yum clean all cleans up yum's cache directories, ensuring that you
download and install the latest versions of the packages.
• If your system is not up to date and any underlying system
components need to be upgraded before this yum update can succeed,
yum will tell you what those are.
SLES
$ sudo zypper clean --all
$ sudo zypper up -r http://archive.cloudera.com/cm5/sles/11/x86_64/cm/5/
To download from your own repository:
$ sudo zypper clean --all
$ sudo zypper rr cm
$ sudo zypper ar -t rpm-md http://myhost.example.com/<path_to_cm_repo>/cm
$ sudo zypper up -r http://myhost.example.com/<path_to_cm_repo>
Ubuntu or Debian
Use the following commands to clean cached repository information and update
Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install cloudera-manager-server cloudera-manager-agent cloudera-manager-daemons
As this process proceeds, you may be prompted concerning your configuration file
version:
Configuration file `/etc/cloudera-scm-agent/config.ini'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
You will receive a similar prompt for /etc/cloudera-scm-server/db.properties.
Answer N to both of these prompts. Afterward, inspect the config.ini file carefully
and merge the two versions so that the new entries are incorporated.
At the end of this process you should have the following packages, corresponding to the version of Cloudera
Manager you installed, on the host that will become the Cloudera Manager Server host.
OS: RPM-based distributions
Packages:
$ rpm -qa 'cloudera-manager-*'
cloudera-manager-agent-5.0.7-0.cm5.p0.932.el6.x86_64
cloudera-manager-server-5.0.7-0.cm5.p0.932.el6.x86_64
cloudera-manager-daemons-5.0.7-0.cm5.p0.932.el6.x86_64
OS: Ubuntu or Debian
Packages:
~# dpkg-query -l 'cloudera-manager-*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                   Version                Description
+++-======================-======================-============================================================
ii  cloudera-manager-agent 5.0.7-0.cm5.p0.932~sq The Cloudera Manager Agent
ii  cloudera-manager-daemo 5.0.7-0.cm5.p0.932~sq Provides daemons for monitoring Hadoop and related tools.
ii  cloudera-manager-serve 5.0.7-0.cm5.p0.932~sq The Cloudera Manager Server
You may also see an entry for the cloudera-manager-server-db-2 if you are using the embedded database,
and additional packages for plug-ins, depending on what was previously installed on the server host. If the
cloudera-manager-server-db-2 package is installed, and you don't plan to use the embedded database, you
can remove this package.
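One way to confirm that the agent, server, and daemons packages all correspond to the same Cloudera Manager version is to extract the version field from the rpm -qa listing and check that only one distinct value appears. The following is a minimal sketch for RPM-style package names; the function name cm_versions is hypothetical, and the sample lines are the ones shown above. Packages such as cloudera-manager-server-db-2 are deliberately ignored by the pattern.

```shell
#!/bin/sh
# Read `rpm -qa 'cloudera-manager-*'` style lines on standard input and print
# each distinct version found for the agent, server, and daemons packages.
# A single line of output means the three packages agree.
cm_versions() {
  sed -n -E 's/^cloudera-manager-(agent|server|daemons)-([0-9][0-9.]*)-.*/\2/p' | sort -u
}

# Demonstration using the sample listing from the text.
printf '%s\n' \
  'cloudera-manager-agent-5.0.7-0.cm5.p0.932.el6.x86_64' \
  'cloudera-manager-server-5.0.7-0.cm5.p0.932.el6.x86_64' \
  'cloudera-manager-daemons-5.0.7-0.cm5.p0.932.el6.x86_64' | cm_versions
```

On a live host you would pipe the output of the rpm -qa command shown above into cm_versions instead of the sample lines.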
Start the Cloudera Manager Server
On the Cloudera Manager Server host (the system on which you installed the cloudera-manager-server
package) do the following:
1. If you are using the embedded PostgreSQL database for Cloudera Manager, start the database:
$ sudo service cloudera-scm-server-db start
2. Start the Cloudera Manager Server:
$ sudo service cloudera-scm-server start
You should see the following:
Starting cloudera-scm-server: [ OK ]
Note: If you have problems starting the server, such as database permissions problems, you can use
the server's log /var/log/cloudera-scm-server/cloudera-scm-server.log to troubleshoot the
problem.
Upgrade Cloudera Manager Agent Packages
Important: All hosts in the cluster must have access to the Internet if you plan to use
archive.cloudera.com as the source for installation files. If you do not have Internet access, create
a custom repository.
1. Log in to the Cloudera Manager Admin Console.
2. Upgrade hosts using one of the following methods:
• Cloudera Manager installs Agent software
1. Select Yes, I would like to upgrade the Cloudera Manager Agent packages now and click Continue.
2. Select the release of the Cloudera Manager Agent to install. Normally, this will be the Matched Release
for this Cloudera Manager Server. However, if you used a custom repository (that is, a repository other
than archive.cloudera.com) for the Cloudera Manager server, select Custom Repository and provide
the required information. The custom repository allows you to use an alternative location, but that
location must contain the matched Agent version. Click Continue to proceed to the Configure Java
Encryption screen.
3. If local laws permit you to deploy unlimited strength encryption, and you want to run a secure cluster,
check the Install Java Unlimited Strength Encryption Policy Files checkbox to install unlimited strength
policy files for Java. This is required because during upgrade Cloudera Manager installs a copy of the
Java 7 JDK, which does not include the unlimited strength policy files. Click Continue to proceed to the
Upgrade Cloudera Manager Agent Packages screen.
4. Specify credentials and initiate Agent installation:
a. Select root or enter the user name for an account that has password-less sudo permission.
b. Select an authentication method:
• If you choose to use password authentication, enter and confirm the password.
• If you choose to use public-key authentication provide a passphrase and path to the required
key files.
c. You can choose to specify an alternate SSH port. The default value is 22.
d. You can specify the maximum number of host installations to run at once. The default value is 10.
5. Click Continue. The Cloudera Manager Agent packages are installed.
• Manually install Agent software. On all cluster hosts except the Cloudera Manager server host:
1. Select No, I would like to skip the agent upgrade now and click Continue.
2. Copy the appropriate repo file as described in step 3 of Upgrade Cloudera Manager Server Packages
on page 54.
3. Run the following commands:
Operating System Commands
RHEL
$ sudo yum clean all
$ sudo yum upgrade 'cloudera-*'
Note:
• yum clean all cleans up yum's cache directories, ensuring that
you download and install the latest versions of the packages.
• If your system is not up to date and any underlying system
components need to be upgraded before this yum update can
succeed, yum will tell you what those are.
SLES
$ sudo zypper clean --all
$ sudo zypper up -r http://archive.cloudera.com/cm5/sles/11/x86_64/cm/5/
To download from your own repository:
$ sudo zypper clean --all
$ sudo zypper rr cm
$ sudo zypper ar -t rpm-md http://myhost.example.com/<path_to_cm_repo>/cm
$ sudo zypper up -r http://myhost.example.com/<path_to_cm_repo>
Ubuntu or Debian
Use the following commands to clean cached repository information and update
Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install cloudera-manager-agent cloudera-manager-daemons
As this process proceeds, you may be prompted concerning your configuration
file version:
Configuration file `/etc/cloudera-scm-agent/config.ini'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
You will receive a similar prompt for
/etc/cloudera-scm-server/db.properties. Answer N to both of these prompts.
Afterward, inspect the config.ini file carefully and merge the two versions
so that the new entries are incorporated.
4. If local laws permit you to deploy unlimited strength encryption, and you want to run a secure cluster,
install the unlimited strength JCE policy files. This is required because during upgrade Cloudera Manager
installs a copy of the Java 7 JDK, which does not include the unlimited strength policy files.
3. Click Continue. The Host Inspector runs to inspect your managed hosts for correct versions and configurations.
If there are problems, you can make changes and then re-run the inspector. When you are satisfied with the
inspection results, click Continue and then click Finish.
4. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file
paths required vary based on the services to be installed. For example, you might confirm the NameNode
Data Directory and the DataNode Data Directory for HDFS.
Warning: DataNode data directories should not be placed on NAS devices.
Click Continue. The wizard starts the services.
5. Click Finish.
All services (except for the services you stopped in Stop Selected Services on page 44) should be running.
Verify the Upgrade Succeeded
If the commands to update and start the Cloudera Manager Server complete without errors, you can assume
the upgrade has completed as desired. For additional assurance, you can check that the server versions have
been updated.
1. In the Cloudera Manager Admin Console, click the Hosts tab.
2. Click Host Inspector. On large clusters, the host inspector may take some time to finish running. You must
wait for the process to complete before proceeding to the next step.
3. Click Show Inspector Results. All results from the host inspector process are displayed including the currently
installed versions. If this includes listings of current component versions, the installation completed as
expected.
Start Selected Services
Start services you shut down in Stop Selected Services on page 44:
1. Do one of the following, depending on which services you shut down:
• From the Home page, click the icon next to the cluster name and select Start.
• From the Home page, click the icon next to the name of each service you shut down and select Start.
2. In the confirmation dialog that displays, click Start.
Deploy Updated Client Configurations
The services whose client configurations require redeployment are indicated with an icon on the Home page
Status tab. To ensure clients have current information about resources, update the client configuration:
1. From the Home page, click the icon next to the cluster name and select Deploy Client Configuration.
2. In the confirmation dialog that displays, click Deploy Client Configuration.
Test the Installation
When you have finished the upgrade to Cloudera Manager, you can test the installation to verify that the
monitoring features are working as expected; follow instructions under Testing the Installation.
(Optional) Upgrade CDH
Cloudera Manager 5 can manage both CDH 4 and CDH 5, so upgrading existing CDH 4 installations is not required,
but you may want to upgrade to the latest version. For more information on upgrading CDH, see Upgrading CDH
and Managed Services.
Upgrading Cloudera Manager 4 to Cloudera Manager 5
This process applies to upgrading all versions of Cloudera Manager 4 to Cloudera Manager 5.
In most cases it is possible to complete the following upgrade without shutting down most CDH services, although
you may need to stop some dependent services. CDH daemons can continue running, unaffected, while Cloudera
Manager is upgraded. The upgrade process does not affect your CDH installation. However, to take advantage
of Cloudera Manager 5 features, after the upgrade all services will have to be restarted. After upgrading Cloudera
Manager you may also want to upgrade CDH 4 clusters to CDH 5.
Upgrading from a version of Cloudera Manager 4 to the latest version of Cloudera Manager involves the following
broad steps.
Review Warnings and Notes
Warning:
• Cloudera Management Service databases
Cloudera Manager 5 stores Host Monitor and Service Monitor data in a local datastore instead of
in an embedded PostgreSQL or external database. The Cloudera Manager upgrade process
automatically migrates data from existing databases to the local datastore. For further information,
see Data Storage for Monitoring Data.
The Host Monitor and Service Monitor databases are stored on the partition hosting /var. Ensure
that you have at least 20 GB available on this partition.
If you have been storing the data in an external database, you can drop those databases after
upgrade completes.
• Impala
Cloudera Manager 5 supports Impala 1.2.1 or later. If the version of your Impala service is 1.1 or
earlier, the following upgrade instructions will work, but once the upgrade has completed, you will
see a validation warning for your Impala service and you will not be able to restart your Impala
(or Hue) services until you upgrade your Impala service to 1.2.1 or later. If you want to continue
to use Impala 1.1 or earlier, do not upgrade to Cloudera Manager 5.
• Navigator
If you have enabled auditing with Cloudera Navigator, during the process of upgrading to Cloudera
Manager 5 auditing is suspended and is only restarted when you restart the roles of audited
services.
• Hard Restart of Cloudera Manager Agents
Certain circumstances will require that you hard restart the Cloudera Manager Agent on each host:
• To deploy a fix to an issue where Cloudera Manager didn't always correctly restart services
• To take advantage of the maximum file descriptor feature
• To enable HDFS DataNodes to start if you plan to perform the step (Optional) Upgrade CDH on
page 61 after upgrading Cloudera Manager
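The 20 GB requirement for the partition hosting /var, stated in the Cloudera Management Service databases warning above, can be checked from the command line before you start. The following is a minimal sketch; the function name has_free_gb is hypothetical, and the check assumes 1 GB is 1024 * 1024 KB.

```shell
#!/bin/sh
# Succeed if the file system holding the path in $1 has at least $2 gigabytes
# available. With -P and -k, df prints POSIX-format output in which the fourth
# column of the second line is the available space in kilobytes.
has_free_gb() {
  avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

if has_free_gb /var 20; then
  echo "/var has at least 20 GB available"
else
  echo "/var has less than 20 GB available"
fi
```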
Important:
• Hive
Cloudera Manager 4.5 added support for Hive, which includes the Hive Metastore Server role type.
This role manages the Metastore process when Hive is configured with a remote Metastore.
When upgrading from Cloudera Manager prior to 4.5, Cloudera Manager automatically creates
new Hive service(s) to capture the previous implicit Hive dependency from Hue and Impala. Your
previous services will continue to function without impact. If Hue was using a Hive Metastore
backed by a Derby database, then the newly created Hive Metastore Server will also use Derby.
Since Derby does not allow concurrent connections, Hue will continue to work, but the new Hive
Metastore Server will fail to run. The failure is harmless (because nothing uses this new Hive
Metastore Server at this point) and intentional, to preserve the set of cluster functionality as it
was before upgrade. Cloudera discourages the use of a Derby backed Hive Metastore due to its
limitations. You should consider switching to a different supported database.
Cloudera Manager provides a Hive configuration option to bypass the Hive Metastore Server. When
this configuration is enabled, Hive clients, Hue, and Impala connect directly to the Hive Metastore
database. Prior to Cloudera Manager 4.5, Hue and Impala connected directly to the Hive Metastore
database, so the bypass mode is enabled by default when upgrading to Cloudera Manager 4.5 or
later. This is to ensure the upgrade doesn't disrupt your existing setup. You should plan to disable
the bypass mode, especially when using CDH 4.2 or later. Using the Hive Metastore Server is the
recommended configuration and the WebHCat Server role requires the Hive Metastore Server to
not be bypassed. To disable bypass mode, see Disabling Bypass Mode.
Cloudera Manager 4.5 or later also supports HiveServer2 with CDH 4.2. In CDH 4 HiveServer2 is
not added by default, but can be added as a new role under the Hive service (see Role Instances).
In CDH 5, HiveServer2 is a mandatory role.
Note:
• When you install on EC2 using the Cloud wizard, the wizard creates a security group that by default
opens ports used by Cloudera Manager and CDH components. Before upgrading, you must manually
open these ports:
– Upgrades from Cloudera Manager 4.7.2 or earlier - 7185 for the Cloudera Manager Event Server.
– Upgrades from Cloudera Manager 5.0.0 beta 2 or earlier - 18080 and 18081 for the Spark master
and worker web UI ports.
• If you are upgrading from Cloudera Manager Free Edition (version 4.5 or earlier) you are upgraded
to Cloudera Express, which includes a number of features that were previously available only with
Cloudera Enterprise. Of those features, activity monitoring requires a database. Thus, upon
upgrading to Cloudera Manager 5, you must specify activity monitor database information. You
have the option to use the embedded PostgreSQL database, which Cloudera Manager can set up
automatically.
Perform Prerequisite Steps
Warning: Cloudera Manager 5 does not support CDH 3 and you cannot upgrade Cloudera Manager 4
to Cloudera Manager 5 if you have a cluster running CDH 3. Therefore, to upgrade CDH 3 clusters to
CDH 4 using Cloudera Manager you must use Cloudera Manager 4.
Perform the following before upgrading to Cloudera Manager 5:
• Upgrade Cloudera Manager 3.7.x to Cloudera Manager 4 - See Upgrading Cloudera Manager 3.7.x on page
61.
• Upgrade all CDH 3 clusters to CDH 4 - See Upgrading CDH 3. If you attempt to upgrade to Cloudera Manager
5 and Cloudera Manager 4 is managing a CDH 3 cluster, the Cloudera Manager 5 server will not start, and
you will be notified that you must downgrade to Cloudera Manager 4. Instructions for downgrading may be
found here: Reverting a Failed Cloudera Manager Upgrade on page 62. After downgrading, you must upgrade
your CDH 3 cluster to CDH 4 before you can upgrade Cloudera Manager. See Upgrading CDH 3.
• Obtain host credentials - You must have SSH access and be able to log in using a root account or an account
that has password-less sudo permission. See Cloudera Manager Requirements for more information.
• Stop running commands - Use the Admin Console to check for any running commands. You can either wait
for commands to complete or abort any running commands. For more information on viewing and aborting
running commands, see Viewing Running and Recent Commands.
• Prepare databases - See Database Considerations for Cloudera Manager Upgrades on page 42.
• Cloudera Manager 5 supports HDFS High Availability only with Automatic Failover. If your cluster has enabled
High Availability without Automatic Failover, you must enable Automatic Failover before upgrading to Cloudera
Manager 5. See Configuring HDFS High Availability.
Stop Selected Services
If your cluster meets any of the conditions listed in the following table, you must stop the indicated services or
roles.
Condition: Running a version of Cloudera Manager that has the Cloudera Management Service
Procedure: Stop the Cloudera Management Service.
Condition: Upgrading from Cloudera Manager 4.5 or later, and using the embedded PostgreSQL database for the Hive
Metastore
Procedure: Stop the services that have a dependency on the Hive Metastore (Hue, Impala, and Hive). You will not
be able to stop the Cloudera Manager Server database while these services are running. If you attempt to upgrade
while the embedded database is running, the upgrade will fail. Stop services that depend on the Hive Metastore
in the following order:
1. Stop the Hue and Impala services.
2. Stop the Hive service.
Condition: Running Cloudera Navigator
Procedure: Stop any of the following roles whose service's Queue Policy configuration
(navigator.batch.queue_policy) is set to SHUTDOWN:
• HDFS - NameNode
• HBase - Master and RegionServers
• Hive - HiveServer2
• Hue - Beeswax Server
Stopping these roles renders any service depending on these roles unavailable. For the HDFS - NameNode case
this implies most of the services in the cluster will be unavailable until the upgrade is finished.
Stop Cloudera Manager Server, Database, and Agent
1. On the host running the Cloudera Manager Server, stop the Cloudera Manager Server:
$ sudo service cloudera-scm-server stop
2. If you are using the embedded PostgreSQL database for Cloudera Manager, stop the database:
$ sudo service cloudera-scm-server-db stop
Important: If you are not running the embedded database service and you attempt to stop it, you
will get a message to the effect that the service cannot be found. If instead you get a message
that the shutdown failed, this means the embedded database is still running, probably due to
services connected to the Hive Metastore. Do not proceed with the installation until you have
stopped all your Metastore-dependent services and the database successfully shuts down (restart
the Cloudera Manager server to shut down services as necessary). If you continue without solving
this, your upgrade will fail and you will be left with a non-functional Cloudera Manager installation.
3. If the Cloudera Manager host is also running the Cloudera Manager Agent, stop the Cloudera Manager Agent:
$ sudo service cloudera-scm-agent stop
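The three stop commands above can be collected into one small script. The following is a minimal sketch, not part of the product: the function names and the DRY_RUN, HAS_EMBEDDED_DB, and HAS_LOCAL_AGENT variables are hypothetical names introduced for this example. By default it only prints the commands so that the order can be reviewed first.

```shell
# Sketch only: stop Cloudera Manager components in the documented order.
# DRY_RUN, HAS_EMBEDDED_DB and HAS_LOCAL_AGENT are hypothetical toggles
# introduced for this example; they are not Cloudera Manager settings.

run() {
  # Print the command when DRY_RUN=1 (the default), otherwise execute it.
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

stop_cm_services() {
  # Step 1: stop the Cloudera Manager Server.
  run sudo service cloudera-scm-server stop || return 1
  # Step 2: stop the embedded PostgreSQL database, if it is in use.
  # A failure here means the database is still running; do not continue.
  if [ "${HAS_EMBEDDED_DB:-1}" = "1" ]; then
    run sudo service cloudera-scm-server-db stop || return 1
  fi
  # Step 3: stop the Agent if it runs on the same host.
  if [ "${HAS_LOCAL_AGENT:-1}" = "1" ]; then
    run sudo service cloudera-scm-agent stop || return 1
  fi
}
```

Calling stop_cm_services with DRY_RUN left at its default prints the commands; setting DRY_RUN=0 runs them and stops at the first failure, which matches the warning above about not proceeding when the database shutdown fails.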
(Optional) Upgrade JDK on Cloudera Manager Server Host and Agent Hosts
If you are manually upgrading the Cloudera Manager Agent packages in Upgrade Cloudera Manager Agent
Packages on page 56, and you plan to upgrade to CDH 5, install JDK1.7u45 on the Agent hosts following the
instructions in Java Development Kit Installation.
If you are not running Cloudera Manager Server on the same host as a Cloudera Manager Agent, and you want
all hosts to run the same JDK version, optionally install JDK1.7u45 on that host.
Upgrade Cloudera Manager Server Packages
1. To upgrade the Cloudera Manager Server Packages, you can upgrade from Cloudera's repository at
http://archive.cloudera.com/cm5/ or you can create your own repository, as described in Understanding
Custom Installation Solutions. Creating your own repository is necessary if you are upgrading a cluster that
does not have access to the Internet.
a. Find the Cloudera repo file for your distribution by starting at http://archive.cloudera.com/cm5/
and navigating to the directory that matches your operating system.
For example, for Red Hat or CentOS 6, you would navigate to
http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/. Within that directory, find the repo file
that contains information including the repository's base URL and GPG key. The contents of the
cloudera-manager.repo file might appear as follows:
[cloudera-manager]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera Manager
baseurl=http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/
gpgkey = http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera
gpgcheck = 1
For Ubuntu or Debian systems, the repo file can be found by navigating to the appropriate release directory,
for example, http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm. The repo file, in this
case, cloudera.list, may appear as follows:
# Packages for Cloudera Manager, Version 5, on Debian 7.0 x86_64
deb http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm wheezy-cm5 contrib
deb-src http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm wheezy-cm5 contrib
b. Replace the repo file in the configuration location for the package management software for your system.
• RHEL - Copy cloudera-manager.repo to /etc/yum.repos.d/.
• SLES - Copy cloudera-manager.repo to /etc/zypp/repos.d/.
• Ubuntu or Debian - Copy cloudera.list to /etc/apt/sources.list.d/.
c. Run the following commands:
RHEL:
$ sudo yum clean all
$ sudo yum upgrade 'cloudera-*'
Note:
• yum clean all cleans up yum's cache directories, ensuring that you download and install the latest
versions of the packages.
• If your system is not up to date and any underlying system components need to be upgraded before this
yum update can succeed, yum will tell you what those are.
SLES:
$ sudo zypper clean --all
$ sudo zypper up -r http://archive.cloudera.com/cm5/sles/11/x86_64/cm/5/
To download from your own repository:
$ sudo zypper clean --all
$ sudo zypper rr cm
$ sudo zypper ar -t rpm-md http://myhost.example.com/<path_to_cm_repo>/cm
$ sudo zypper up -r http://myhost.example.com/<path_to_cm_repo>
Ubuntu or Debian:
Use the following commands to clean cached repository information and update Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install cloudera-manager-server cloudera-manager-agent cloudera-manager-daemons
As this process proceeds, you may be prompted concerning your configuration file
version:
Configuration file `/etc/cloudera-scm-agent/config.ini'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
You will receive a similar prompt for /etc/cloudera-scm-server/db.properties.
Answer N to both these prompts. The config.ini file should be carefully inspected
and the files merged together to ensure the new entries are incorporated.
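The per-distribution repo file locations from step b can be captured in a small lookup, which may help when the same upgrade is scripted across mixed operating systems. This is a minimal sketch: the function name and the family keywords (rhel, sles, ubuntu, debian) are labels chosen for this example, while the paths are the ones listed in step b.

```shell
# Sketch only: map an OS family keyword to the repo file destination
# given in step b. The keywords are labels chosen for this example.
repo_target_for() {
  case "$1" in
    rhel)          printf '%s\n' "/etc/yum.repos.d/cloudera-manager.repo" ;;
    sles)          printf '%s\n' "/etc/zypp/repos.d/cloudera-manager.repo" ;;
    ubuntu|debian) printf '%s\n' "/etc/apt/sources.list.d/cloudera.list" ;;
    *)
      # Unknown family: report on stderr and fail so callers can stop.
      printf 'unsupported OS family: %s\n' "$1" >&2
      return 1
      ;;
  esac
}
```

For example, a script could run sudo cp cloudera-manager.repo "$(repo_target_for rhel)" before the yum commands in step c.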
At the end of this process you should have the following packages, corresponding to the version of Cloudera
Manager you installed, on the host that will become the Cloudera Manager Server host.
RPM-based distributions:
$ rpm -qa 'cloudera-manager-*'
cloudera-manager-agent-5.0.7-0.cm5.p0.932.el6.x86_64
cloudera-manager-server-5.0.7-0.cm5.p0.932.el6.x86_64
cloudera-manager-daemons-5.0.7-0.cm5.p0.932.el6.x86_64
Ubuntu or Debian:
~# dpkg-query -l 'cloudera-manager-*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                   Version                Description
+++-======================-======================-============================================================
ii cloudera-manager-agent 5.0.7-0.cm5.p0.932~sq The Cloudera Manager Agent
ii cloudera-manager-daemo 5.0.7-0.cm5.p0.932~sq Provides daemons for monitoring Hadoop and related tools.
ii cloudera-manager-serve 5.0.7-0.cm5.p0.932~sq The Cloudera Manager Server
You may also see an entry for the cloudera-manager-server-db-2 if you are using the embedded database,
and additional packages for plug-ins, depending on what was previously installed on the server host. If the
cloudera-manager-server-db-2 package is installed, and you don't plan to use the embedded database, you
can remove this package.
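On RPM-based hosts, the package check above can be automated by extracting the version from each package name and confirming that only one distinct version is present. The sketch below assumes package names shaped like the rpm -qa sample output shown in this section (cloudera-manager-<component>-<x.y.z>-...); the function names are hypothetical, and lines that do not match that shape are ignored.

```shell
# Sketch only: check that the installed Cloudera Manager packages share one
# version. Assumes names shaped like the rpm -qa sample in this section.

cm_versions() {
  # Read package names on stdin and print each distinct x.y.z version once.
  sed -nE 's/^cloudera-manager-[a-z-]+-([0-9]+\.[0-9]+\.[0-9]+)-.*$/\1/p' | sort -u
}

cm_versions_consistent() {
  # Succeed only if exactly one distinct version was found.
  cm_versions | awk 'END { exit (NR == 1 ? 0 : 1) }'
}
```

A possible use is rpm -qa 'cloudera-manager-*' | cm_versions_consistent || echo "mixed or missing versions", which flags a host where the server, agent, and daemons packages were not all upgraded together.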
Start the Cloudera Manager Server
On the Cloudera Manager Server host (the system on which you installed the cloudera-manager-server
package) do the following:
1. If you are using the embedded PostgreSQL database for Cloudera Manager, start the database:
$ sudo service cloudera-scm-server-db start
2. Start the Cloudera Manager Server:
$ sudo service cloudera-scm-server start
You should see the following:
Starting cloudera-scm-server: [ OK ]
Note: If you have problems starting the server, such as database permissions problems, you can use
the server's log /var/log/cloudera-scm-server/cloudera-scm-server.log to troubleshoot the
problem.
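When the server does not start, a quick way to use the log named in the note above is to list its most recent error entries. This is a minimal sketch: the function name and the CM_LOG variable are hypothetical, and matching on the string ERROR is an assumption about how the log labels severity.

```shell
# Sketch only: show the most recent error lines from the server log.
# CM_LOG is a hypothetical variable; its default is the path from the note.
CM_LOG="${CM_LOG:-/var/log/cloudera-scm-server/cloudera-scm-server.log}"

recent_log_errors() {
  # $1: log file (defaults to CM_LOG); $2: number of lines to show (default 20).
  # Matching on "ERROR" is an assumption about the log's severity labels.
  grep -n 'ERROR' "${1:-$CM_LOG}" | tail -n "${2:-20}"
}
```

Running recent_log_errors with no arguments reads the default log; each printed line is prefixed with its line number in the file, which makes it easier to open the log at the surrounding context.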
Upgrade Cloudera Manager Agent Packages
Important: All hosts in the cluster must have access to the Internet if you plan to use
archive.cloudera.com as the source for installation files. If you do not have Internet access, create
a custom repository.
1. Log in to the Cloudera Manager Admin Console.
2. Upgrade hosts using one of the following methods:
• Cloudera Manager installs Agent software
1. Select Yes, I would like to upgrade the Cloudera Manager Agent packages now and click Continue.
2. Select the release of the Cloudera Manager Agent to install. Normally, this will be the Matched Release
for this Cloudera Manager Server. However, if you used a custom repository (that is, a repository other
than archive.cloudera.com) for the Cloudera Manager server, select Custom Repository and provide
the required information. The custom repository allows you to use an alternative location, but that
location must contain the matched Agent version. Click Continue to proceed to the Configure Java
Encryption screen.
3. If local laws permit you to deploy unlimited strength encryption, and you want to run a secure cluster,
check the Install Java Unlimited Strength Encryption Policy Files checkbox to install unlimited strength
policy files for Java. This is required because during upgrade Cloudera Manager installs a copy of the
Java 7 JDK, which does not include the unlimited strength policy files. Click Continue to proceed to the
Upgrade Cloudera Manager Agent Packages screen.
4. Specify credentials and initiate Agent installation:
a. Select root or enter the user name for an account that has password-less sudo permission.
b. Select an authentication method:
• If you choose to use password authentication, enter and confirm the password.
• If you choose to use public-key authentication provide a passphrase and path to the required
key files.
c. You can choose to specify an alternate SSH port. The default value is 22.
d. You can specify the maximum number of host installations to run at once. The default value is 10.
5. Click Continue. The Cloudera Manager Agent packages are installed.
• Manually install Agent software. On all cluster hosts except the Cloudera Manager server host:
1. Select No, I would like to skip the agent upgrade now and click Continue.
2. Copy the appropriate repo file as described in step 1.b of Upgrade Cloudera Manager Server Packages
on page 54.
3. Run the following commands:
RHEL:
$ sudo yum clean all
$ sudo yum upgrade 'cloudera-*'
Note:
• yum clean all cleans up yum's cache directories, ensuring that you download and install the latest
versions of the packages.
• If your system is not up to date and any underlying system components need to be upgraded before this
yum update can succeed, yum will tell you what those are.
SLES:
$ sudo zypper clean --all
$ sudo zypper up -r http://archive.cloudera.com/cm5/sles/11/x86_64/cm/5/
To download from your own repository:
$ sudo zypper clean --all
$ sudo zypper rr cm
$ sudo zypper ar -t rpm-md http://myhost.example.com/<path_to_cm_repo>/cm
$ sudo zypper up -r http://myhost.example.com/<path_to_cm_repo>
Ubuntu or Debian:
Use the following commands to clean cached repository information and update Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install cloudera-manager-agent cloudera-manager-daemons
As this process proceeds, you may be prompted concerning your configuration
file version:
Configuration file `/etc/cloudera-scm-agent/config.ini'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
You will receive a similar prompt for
/etc/cloudera-scm-server/db.properties. Answer N to both these prompts.
The config.ini file should be carefully inspected and the files merged together
to ensure the new entries are incorporated.
4. If local laws permit you to deploy unlimited strength encryption, and you want to run a secure cluster,
install the unlimited strength JCE policy files. This is required because during upgrade Cloudera Manager
installs a copy of the Java 7 JDK, which does not include the unlimited strength policy files.
3. Click Continue. The Host Inspector runs to inspect your managed hosts for correct versions and configurations.
If there are problems, you can make changes and then re-run the inspector. When you are satisfied with the
inspection results, click Continue and then click Finish.
4. If you are upgrading from a free version of Cloudera Manager prior to 4.6, click Continue to assign the Cloudera
Management Services roles to hosts.
5. If you are upgrading from a free version of Cloudera Manager prior to 4.6 to Cloudera Enterprise, specify
required databases:
a. Configure settings for required databases:
a. Choose the database type:
• Leave the default setting of Use Embedded Database to have Cloudera Manager create and configure
all required databases. Make a note of the auto-generated passwords.
• Select Use Custom Databases to specify external databases. Enter the database host, database
type, database name, username, and password for the databases that you created when you set
up databases for Cloudera Manager.
1. Provide information for the Activity Monitor (only needed when using MapReduce), Reports
Manager, Hive Metastore, and Cloudera Navigator databases. The value you enter as the
database hostname must match the value you entered for the hostname (if any) when you
created the database.
b. Click Test Connection to confirm that Cloudera Manager can communicate with the databases using
the information you have supplied. If the test succeeds in all cases, click Continue; otherwise check
and correct the information you have provided for the databases and then try the test again. (For Hive,
if you are using the embedded database, you will see a message saying the database will be created
at a later point in the installation process.) The Review Changes page displays.
6. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file
paths required vary based on the services to be installed. For example, you might confirm the NameNode
Data Directory and the DataNode Data Directory for HDFS.
Warning: DataNode data directories should not be placed on NAS devices.
Click Continue. The wizard starts the services.
7. Click Finish. If you are upgrading from Cloudera Manager prior to 4.5:
a. Select the host for the Hive Metastore Server role.
b. Review the configuration values and click Accept to continue.
Note:
• If Hue was using a Hive Metastore backed by a Derby database, then the newly created Hive
Metastore Server will also use Derby. Since Derby does not allow concurrent connections,
Hue will continue to work, but the new Hive Metastore Server will fail to run. The failure is
harmless (because nothing uses this new Hive Metastore Server at this point) and intentional,
to preserve the set of cluster functionality as it was before upgrade. Cloudera discourages
the use of a Derby backed Hive Metastore due to its limitations. You should consider
switching to a different supported database.
• Prior to Cloudera Manager 4.5, Hue and Impala connected directly to the Hive Metastore
database, so the bypass mode is enabled by default when upgrading to Cloudera Manager
4.5 or later. This is to ensure the upgrade doesn't disrupt your existing setup. You should
plan to disable the bypass mode, especially when using CDH 4.2 or later. Using the Hive
Metastore Server is the recommended configuration and the WebHCat Server role requires
the Hive Metastore Server to not be bypassed. To disable bypass mode, see Disabling Bypass
Mode. After changing this configuration, you must re-deploy your client configurations,
restart Hive, and restart any Hue or Impala services configured to use that Hive.
• If you are using CDH 4.0 or CDH 4.1, see known issues related to Hive in Known Issues and
Workarounds in Cloudera Manager 5.
8. If you are upgrading from Cloudera Manager prior to 4.8, select where the Impala Catalog Server role will run.
All services (except for the services you stopped in Stop Selected Services on page 53) should be running.
Verify the Upgrade Succeeded
If the commands to update and start the Cloudera Manager Server complete without errors, you can assume
the upgrade has completed as desired. For additional assurance, you can check that the server versions have
been updated.
1. In the Cloudera Manager Admin Console, click the Hosts tab.
2. Click Host Inspector. On large clusters, the host inspector may take some time to finish running. You must
wait for the process to complete before proceeding to the next step.
3. Click Show Inspector Results. All results from the host inspector process are displayed including the currently
installed versions. If this includes listings of current component versions, the installation completed as
expected.
Add Hive Gateway Roles
If you are upgrading from a release prior to Cloudera Manager 4.5, add Hive gateway roles to any hosts where
Hive clients should run.
1. In the Cloudera Manager Admin Console, from the Home page click the Hive service.
2. Go to the Instances tab, and click the Add button. This opens the Add Role Instances page.
3. Select the hosts on which you want a Hive Gateway role to run. This will ensure that the Hive client
configurations are deployed on these hosts.
Configure Cluster Version for Package Installs
Because Cloudera Manager does not manage service software installed as packages, during certain upgrade
scenarios Cloudera Manager assigns a default CDH version of a cluster. You must manually configure the cluster
CDH version to match the package CDH version following the procedure in Configuring the CDH Version for a
Cluster in Managing Clusters with Cloudera Manager. If you do not set the cluster CDH version to the package
CDH version, Cloudera Manager will incorrectly enable and disable service features based on the configured CDH
version.
Upgrade Impala
If your version of Impala was 1.1 or earlier, upgrade to Impala 1.2.1 or later.
(Optional) Hard Restart Cloudera Manager Agents
Several conditions require that you hard restart the Cloudera Manager Agents:
• To deploy a fix to an issue where Cloudera Manager didn't always correctly restart services
• To take advantage of the maximum file descriptor feature
• To enable HDFS DataNodes to start if you plan to perform the step (Optional) Upgrade CDH on page 61 after
upgrading Cloudera Manager
To address any of these conditions, at a convenient time perform the following steps:
1. Stop all services, including the Cloudera Management Service.
2. On all hosts with Cloudera Manager Agents, run the command:
$ sudo service cloudera-scm-agent hard_restart
3. Start all services.
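Step 2 above has to be repeated on every host that runs a Cloudera Manager Agent. The following is a minimal sketch of looping over a host list: the function name, the hosts file format (one hostname per line, with blank lines and # comments skipped), and the DRY_RUN variable are assumptions made for this example, and it presumes SSH access with password-less sudo that does not require a terminal. By default it only prints the commands.

```shell
# Sketch only: hard restart the Agent on each host named in a file.
# The hosts file format and DRY_RUN are assumptions for this example.
hard_restart_agents() {
  hosts_file="$1"
  # Skip blank lines and lines starting with '#'.
  grep -Ev '^[[:space:]]*(#|$)' "$hosts_file" | while IFS= read -r host; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      printf 'ssh %s sudo service cloudera-scm-agent hard_restart\n' "$host"
    else
      # -n keeps ssh from reading the rest of the host list from stdin.
      ssh -n "$host" sudo service cloudera-scm-agent hard_restart
    fi
  done
}
```

Run it only between steps 1 and 3, that is, after all services (including the Cloudera Management Service) have been stopped and before they are started again.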
(Optional) Restart All Services
Cloudera Manager 5 has added monitoring support for all roles that were not previously monitored. However,
the Cloudera Manager Agent will not start sending monitoring data for these roles until:
1. The Cloudera Manager Agent has been upgraded and restarted.
2. The monitored roles have been restarted.
Until you restart the roles, some data will not be present in the monitoring charts and health tests. This is
generally innocuous and simply results in charts that do not display any data and unknown health tests. To
enable monitoring for all roles, at a convenient time perform the following steps:
1. Restart all services, including the Cloudera Management Service:
• From the Home page, click the icon next to the cluster name and select Restart.
• From the Home page, click the icon next to the Cloudera Management Service and select Restart.
Restart Roles of Audited Services
If you are running Cloudera Navigator and you haven't performed (Optional) Restart All Services on page 60,
restart the following roles of audited services:
• HDFS - NameNode
• HBase - Master and RegionServers
• Hive - HiveServer2
• Hue - Beeswax Server
Start Selected Services
If you didn't perform (Optional) Hard Restart Cloudera Manager Agents on page 60, (Optional) Restart All Services
on page 60 or Restart Roles of Audited Services on page 60, start services you shut down in Stop Selected
Services on page 53:
1. Do one of the following, depending on which services you shut down:
• From the Home page, click the icon next to the cluster name and select Start.
• From the Home page, click the icon next to the name of each service you shut down and select Start.
2. In the confirmation dialog that displays, click Start.
Deploy Updated Client Configurations
The services whose client configurations require redeployment are indicated with an icon on the Home page
Status tab. To ensure clients have current information about resources, update client configuration:
1. From the Home page, click the icon next to the cluster name and select Deploy Client Configuration.
2. In the confirmation dialog that displays, click Deploy Client Configuration.
Test the Installation
When you have finished the upgrade to Cloudera Manager, you can test the installation to verify that the
monitoring features are working as expected; follow instructions under Testing the Installation.
(Optional) Upgrade CDH
Cloudera Manager 5 can manage both CDH 4 and CDH 5, so upgrading existing CDH 4 installations is not required,
but you may want to upgrade to the latest version. For more information on upgrading CDH, see Upgrading CDH
and Managed Services.
Upgrading Cloudera Manager 3.7.x
Warning: Cloudera Manager 3 and CDH 3 have reached End of Maintenance (EOM) as of June 20,
2013. Cloudera does not support or provide patches for Cloudera Manager 3 and CDH 3 releases.
You cannot upgrade directly from Cloudera Manager 3.7.x to Cloudera Manager 5; you must upgrade to Cloudera
Manager 4 first before upgrading to Cloudera Manager 5. Follow the instructions for upgrading Cloudera Manager
3.7.x to Cloudera Manager 4 in Upgrade Cloudera Manager 3.7.x to the Latest Cloudera Manager.
Note that the last step in the Cloudera Manager upgrade process is an optional step to upgrade CDH. If you are
running CDH 3, this step is not optional. Cloudera Manager 5 does not support CDH 3 and will not allow you to
complete the upgrade if it detects a CDH 3 cluster. You must upgrade to CDH 4 before you can upgrade to Cloudera
Manager 5. Follow the steps at Upgrading CDH 3 to CDH 4 in a Cloudera Manager Deployment before you attempt
to upgrade your Cloudera Manager Server to version 5.
Re-Running the Cloudera Manager Upgrade Wizard
The first time you log in to the Cloudera Manager server after upgrading your Cloudera Manager software, the
upgrade wizard runs. If you did not complete the wizard at that time, or if you had hosts that were unavailable
at that time and still need to be upgraded, you can re-run the upgrade wizard:
1. Click the Hosts tab.
2. Click Re-run Upgrade Wizard. This takes you back through the installation wizard to upgrade Cloudera
Manager Agents on your hosts as necessary.
3. Select the release of the Cloudera Manager Agent to install. Normally, this will be the Matched Release for
this Cloudera Manager Server. However, if you used a custom repository (that is, a repository other than
archive.cloudera.com) for the Cloudera Manager server, select Custom Repository and provide the required
information. The custom repository allows you to use an alternative location, but that location must contain
the matched Agent version. Click Continue to proceed to the Configure Java Encryption screen.
4. Specify credentials and initiate Agent installation:
a. Select root or enter the user name for an account that has password-less sudo permission.
b. Select an authentication method:
• If you choose to use password authentication, enter and confirm the password.
• If you choose to use public-key authentication provide a passphrase and path to the required key files.
c. You can choose to specify an alternate SSH port. The default value is 22.
d. You can specify the maximum number of host installations to run at once. The default value is 10.
When you click Continue the Cloudera Manager Agent is upgraded on all the currently managed hosts. You
cannot search for new hosts through this process. To add hosts to your cluster, click the Add New Hosts to
Cluster button.
Reverting a Failed Cloudera Manager Upgrade
If you have a CDH 3 cluster running under Cloudera Manager 4, you cannot upgrade to Cloudera Manager 5
because it does not support CDH 3. Likewise, an upgrade from Cloudera Manager 3 to Cloudera Manager 5 is
not supported. In either case, the Cloudera Manager 5 server will not start, and you must now downgrade your
Cloudera Manager server, back to the version you were using prior to attempting the upgrade.
Important: The following instructions assume that a Cloudera Manager upgrade failed, and that the
upgraded server never started, so that the remaining steps of the upgrade process were not performed.
The steps below are not sufficient to revert from a running Cloudera Manager 5 deployment.
Reinstall the Cloudera Manager Server Packages
In this step, you reinstall the Cloudera Manager Server packages at the version you were running previously.
You must reinstall the same version of Cloudera Manager you were using previously, so that the version of your
Cloudera Manager Agents matches the server.
The steps below assume that the Cloudera Manager Server is already stopped (as it failed to start after the
attempted upgrade).
1. If you are using the embedded PostgreSQL database for Cloudera Manager, stop the database on the Cloudera
Manager Server host:
$ sudo service cloudera-scm-server-db fast_stop
2. Reinstall the same Cloudera Manager Server version that you were previously running. You can reinstall
from the Cloudera repository at http://archive.cloudera.com/cm4/ or alternately, you can create your
own repository, as described in Understanding Custom Installation Solutions.
a. Find the Cloudera repo file for your distribution by starting at http://archive.cloudera.com/cm4/
and navigating to the directory that matches your operating system.
For example, for Red Hat or CentOS 6, you would navigate to
http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/. Within that directory, find the repo file
that contains information including the repository's base URL and GPG key. On CentOS 6, the contents of
the cloudera-manager.repo file might appear as follows:
[cloudera-manager]
# Packages for Cloudera Manager, Version 4, on RedHat or CentOS 6 x86_64
name=Cloudera Manager
baseurl=http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/4/
gpgkey = http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera
gpgcheck = 1
For Ubuntu or Debian systems, the repo file can be found by navigating to the appropriate directory, for
example, http://archive.cloudera.com/cm4/debian/squeeze/amd64/cm. The repo file, in this case,
cloudera.list, may appear as follows:
# Packages for Cloudera's Distribution for Hadoop, Version 4, on Debian 6.0 x86_64
deb http://archive.cloudera.com/cm4/debian/squeeze/amd64/cm squeeze-cm4 contrib
deb-src http://archive.cloudera.com/cm4/debian/squeeze/amd64/cm squeeze-cm4 contrib
You must edit the file, if it exists, and modify the URL to reflect the exact version of Cloudera Manager you
are using (unless you want the downgrade to also upgrade to the latest version of Cloudera Manager 4).
The available versions are listed in the corresponding directory on archive.cloudera.com.
Setting the URL (an example):
RHEL:
Replace baseurl=http://archive.cloudera.com/cm4/redhat/5/x86_64/cm/4/
with baseurl=http://archive.cloudera.com/cm4/redhat/5/x86_64/cm/4.7.3/
Ubuntu or Debian:
Replace deb http://archive.cloudera.com/cm4/debian/squeeze/amd64/cm squeeze-cm4 contrib
with deb http://archive.cloudera.com/cm4/debian/squeeze/amd64/cm squeeze-cm4.7.3 contrib
b. Copy the repo file to the configuration location for the package management software for your system:
• RHEL - Copy cloudera-manager.repo to /etc/yum.repos.d/.
• SLES - Copy cloudera-manager.repo to /etc/zypp/repos.d/.
• Ubuntu or Debian - Copy cloudera.list to /etc/apt/sources.list.d/.
c. Run the following commands:
RHEL:
$ sudo yum downgrade 'cloudera-*'
SLES:
$ sudo zypper clean --all
$ sudo zypper dup -r http://archive.cloudera.com/cm4/sles/11/x86_64/cm/4/
To download from your own repository:
$ sudo zypper clean --all
$ sudo zypper dup -r http://myhost.example.com/path_to_cm_repo
Ubuntu or Debian:
There's no action that will downgrade to the version currently in the repository. Read DowngradeHowto,
download the script described therein, run it, and then run apt-get install for the name=version pairs that
it provides for Cloudera Manager.
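The repo file edit described in step a (pinning the URL to an exact Cloudera Manager 4 version) can be done with sed rather than by hand. This is a minimal sketch under stated assumptions: the function name is hypothetical, the patterns match only lines shaped like the repo file examples in this section, and applying the same change to the deb-src line is an assumption, since the example in step a shows only the deb line.

```shell
# Sketch only: pin a Cloudera Manager 4 repo file to an exact version.
# Reads the repo file on stdin and writes the edited file to stdout.
pin_cm_version() {
  version="$1"
  # First rule: yum/zypper baseurl ending in /cm/4/ becomes /cm/<version>/.
  # Second rule: apt suite <release>-cm4 becomes <release>-cm<version>
  # (applied to deb-src lines too, which is an assumption).
  sed -E \
    -e "s#^(baseurl=.*/cm)/4/?\$#\1/${version}/#" \
    -e "s#^(deb(-src)? .*/cm) ([a-z]+)-cm4 contrib\$#\1 \3-cm${version} contrib#"
}
```

For example, pin_cm_version 4.7.3 < cloudera-manager.repo > cloudera-manager.repo.pinned writes a pinned copy that can be reviewed before it is copied into place in step b. Lines that do not match either pattern, such as gpgkey and gpgcheck, pass through unchanged.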
At the end of this process you should have the following packages, corresponding to the version of Cloudera
Manager you installed, on the Cloudera Manager Server host. For example, for CentOS,
$ rpm -qa 'cloudera-manager-*'
cloudera-manager-daemons-4.7.3-1.cm473.p0.163.el6.x86_64
cloudera-manager-server-4.7.3-1.cm473.p0.163.el6.x86_64
cloudera-manager-agent-4.7.3-1.cm473.p0.163.el6.x86_64
For Ubuntu or Debian, you should have packages similar to those shown below.
~# dpkg-query -l 'cloudera-manager-*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                   Version                Description
+++-======================-======================-============================================================
ii cloudera-manager-agent 4.7.3-1.cm473.p0.163~sq The Cloudera Manager Agent
ii cloudera-manager-daemo 4.7.3-1.cm473.p0.163~sq Provides daemons for monitoring Hadoop and related tools.
ii cloudera-manager-serve 4.7.3-1.cm473.p0.163~sq The Cloudera Manager Server
You may also see an entry for the cloudera-manager-server-db if you are using the embedded database,
and additional packages for plug-ins, depending on what was previously installed on the server host. If the
commands to update the server complete without errors, you can assume the upgrade has completed as desired.
For additional assurance, you will have the option to check that the server versions have been updated after
you start the server.
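As a rough illustration of the package check described above, the following Python sketch parses lines in the format printed by rpm -qa and confirms that every Cloudera Manager package reports the same version. The function names and the regular expression are assumptions made for this example; they are not part of Cloudera Manager.

```python
import re

# Assumed line layout: <name>-<version>-<release>, as in the rpm -qa example above.
RPM_LINE = re.compile(
    r"^(?P<name>cloudera-manager-[a-z-]+?)-(?P<version>\d+(?:\.\d+)*)-(?P<release>\S+)$"
)

def parse_rpm_lines(lines):
    """Return {package name: version} for lines that look like Cloudera Manager packages."""
    found = {}
    for line in lines:
        match = RPM_LINE.match(line.strip())
        if match:
            found[match.group("name")] = match.group("version")
    return found

def versions_consistent(packages, expected_version):
    """True when at least one package was found and all report expected_version."""
    return bool(packages) and all(v == expected_version for v in packages.values())

# Sample lines copied from the CentOS example above.
sample_lines = [
    "cloudera-manager-daemons-4.7.3-1.cm473.p0.163.el6.x86_64",
    "cloudera-manager-server-4.7.3-1.cm473.p0.163.el6.x86_64",
    "cloudera-manager-agent-4.7.3-1.cm473.p0.163.el6.x86_64",
]
packages = parse_rpm_lines(sample_lines)
print(packages, versions_consistent(packages, "4.7.3"))
```

In practice you would feed the function the lines captured from rpm -qa 'cloudera-manager-*' on the Cloudera Manager Server host.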
Start the Server
On the Cloudera Manager Server host (the system on which you installed the cloudera-manager-server
package) do the following:
1. If you are using the embedded PostgreSQL database for Cloudera Manager, start the database:
$ sudo service cloudera-scm-server-db start
2. Start the server:
$ sudo service cloudera-scm-server start
You should see the following:
Starting cloudera-scm-server: [ OK ]
Note: If you have problems starting the server, such as database permissions problems, you can
use the server's log /var/log/cloudera-scm-server/cloudera-scm-server.log to
troubleshoot the problem.
Other Cloudera Manager Settings
From the Administration tab you can select options for configuring settings that affect how Cloudera Manager
interacts with your cluster.
Administration Settings
The Settings page provides a number of categories as follows:
• Performance - Set the Cloudera Manager Agent heartbeat interval.
• Advanced - Enable API debugging and other advanced options.
• Monitoring - Set Agent health status parameters. For configuration instructions, see Configuring Cloudera
Manager Agents on page 11.
• Security - Set TLS encryption settings to enable TLS encryption between the Cloudera Manager Server, Agents,
and clients. For configuration instructions, see Configuring TLS Security for Cloudera Manager on page 31.
You can also:
– Set the realm for Kerberos security and point to a custom keytab retrieval script. For configuration
instructions, see Configuring Hadoop Security with Cloudera Manager.
– Specify session timeout and a "Remember Me" option.
• Ports and Addresses - Set ports for the Cloudera Manager Admin Console and Server. For configuration
instructions, see Configuring Cloudera Manager Server Ports on page 9.
• Other
– Enable Cloudera usage data collection. For configuration instructions, see Managing Anonymous Usage
Data Collection on page 72.
– Set a custom header color and banner text for the Admin console.
– Set an "Information Assurance Policy" statement – this statement will be presented to every user before
they are allowed to access the login dialog. The user must click "I Agree" in order to proceed to the login
dialog.
– Disable/enable the auto-search for the Events panel at the bottom of a page.
• Support
– Configure diagnostic data collection properties. See Diagnostic Data Collection on page 72.
– Configure how to access Cloudera Manager help files. By default, when you click the Help link under the
Support menu in the Cloudera Manager Admin console, Help files from the Cloudera web site are opened.
This is because local Help files are not updated after installation. You can configure Cloudera Manager to
open either the latest Help files from the Cloudera web site (this option requires Internet access from the
browser) or locally-installed Help files by configuring the property Open latest Help files from the Cloudera
website.
• External Authentication - Specify the configuration to use LDAP, Active Directory, or an external program for
authentication. See Configuring External Authentication on page 24 for instructions.
• Parcels - Configure settings for parcels, including the location of remote repositories that should be made
available for download, and other settings such as the frequency with which Cloudera Manager will check
for new parcels, limits on the number of downloads or concurrent distribution uploads. See Parcels for more
information.
• Network - Configure proxy server settings.
• Custom Service Descriptors - Configure custom service descriptor properties for Add-on Services.
User Interface Language Settings
You can change the language of the Cloudera Manager Admin Console User Interface through the language
preference in your browser. Information on how to do this for the browsers supported by Cloudera Manager is
shown under the Administration page. You can also change the language for the information provided with
activity and health events, and for alert email messages by selecting Language, selecting the language you want
from the drop-down list on this page, then clicking Save Changes.
Managing Licenses
When you install Cloudera Manager, you can select Cloudera Express (no license required), a 60-day
Cloudera Enterprise Data Hub Edition trial license, or Cloudera Enterprise (which requires a license). You can
later end a trial license or upgrade your license.
About Trial Licenses
You can use the trial license only once. After the 60-day trial period has expired or you have ended the trial, you
cannot restart it.
When a trial ends, features that require a Cloudera Enterprise license immediately become unavailable. However,
data or configurations associated with the disabled functions are not deleted, and become available again when
you install a Cloudera Enterprise license. Trial expiration or termination has the following effects:
• Only local users can log in (no LDAP or SAML authentication).
• Configuration history is unavailable.
• Alerts cannot be delivered as SNMP traps.
• Operational reports are inaccessible (but remain in the database).
• Commands such as Rolling Restart, History and Rollback (under the Configuration tab), Send Diagnostic Data,
and starting Cloudera Navigator roles are disabled or not available.
Accessing the License Page
To access the license page, select Administration > License.
If you have a license installed, the license page indicates its status (for example, whether your license is currently
valid) and displays the license details: the license owner, the license key, and the expiration date of the license,
if there is one.
At the right side of the page a table shows the usage of licensed components based on the number of hosts
with those products installed. You can move the cursor over the icon next to an item to see an explanation of
that item.
• Basic Edition - a cluster running core CDH services: HDFS, Hive, Hue, MapReduce, Oozie, Sqoop, YARN, and
ZooKeeper.
• Flex Edition - a cluster running core CDH services plus one of the following: Accumulo, HBase, Impala, Navigator,
Solr, Spark.
• Data Hub Edition - a cluster running core CDH services plus any of the following: Accumulo, HBase, Impala,
Navigator, Solr, Spark.
Ending a Cloudera Enterprise Data Hub Edition Trial
If you are using the trial edition, the License page indicates when your license will expire. However, you can end
the trial at any time (prior to expiration) as follows:
1. On the License page, click End Trial.
2. Confirm that you want to end the trial.
3. Restart the Cloudera Management Service, HBase, HDFS, and Hive services to pick up configuration changes.
Upgrading from Cloudera Express to a Cloudera Enterprise Data Hub Edition Trial
To start a trial, on the License page, click Try Cloudera Enterprise Data Hub Edition for 60 Days.
1. Cloudera Manager displays a pop-up describing the features enabled with Cloudera Enterprise Data Hub
Edition. Click OK to proceed. At this point, your installation is upgraded and the Customize Role Assignments
page displays.
2. Under Reports Manager click Select a host. The pageable host selection dialog displays.
The following shortcuts for specifying host names are supported:
• Range of hostnames (without the domain portion)
Range Definition          Matching Hosts
10.1.1.[1-4]              10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4
host[1-3].company.com     host1.company.com, host2.company.com, host3.company.com
host[07-10].company.com   host07.company.com, host08.company.com, host09.company.com, host10.company.com
• IP addresses
• Rack name
3. Select a host and click OK.
4. When you are satisfied with the assignments, click Continue. The Database Setup page displays.
5. Configure settings for required databases:
a. Choose the database type:
• Leave the default setting of Use Embedded Database to have Cloudera Manager create and configure
all required databases. Make a note of the auto-generated passwords.
• Select Use Custom Databases to specify external databases. Enter the database host, database type,
database name, username, and password for the databases that you created when you set up databases
for Cloudera Manager.
1. Provide information for the Activity Monitor (only needed when using MapReduce), Reports Manager,
Hive Metastore, and Cloudera Navigator databases. The value you enter as the database
hostname must match the value you entered for the hostname (if any) when you created the
database.
b. Click Test Connection to confirm that Cloudera Manager can communicate with the databases using the
information you have supplied. If the test succeeds in all cases, click Continue; otherwise check and correct
the information you have provided for the databases and then try the test again. (For Hive, if you are
using the embedded database, you will see a message saying the database will be created at a later point
in the installation process.) The Review Changes page displays.
6. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file
paths required vary based on the services to be installed. For example, you might confirm the NameNode
Data Directory and the DataNode Data Directory for HDFS.
Warning: DataNode data directories should not be placed on NAS devices.
Click Continue. The wizard starts the services.
7. At this point, your installation is upgraded. Click Continue.
8. Restart Cloudera Management Services and audited services to pick up configuration changes. The audited
services will write audit events to a log file, but the events are not transferred to the Cloudera Navigator
Audit Server until you add and start the Cloudera Navigator Audit Server role as described in Adding and
Starting Cloudera Navigator Roles on page 20. For information on Cloudera Navigator, see Cloudera Navigator
documentation.
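The host range shortcuts listed in the host selection step can be illustrated with a short Python sketch. It expands a single bracketed numeric range and keeps leading zeros, reproducing the matching hosts shown in the range table. The function name and the zero-padding rule are assumptions for illustration only; this is not the code Cloudera Manager uses.

```python
import re

# A single bracketed numeric range such as [1-4] or [07-10].
RANGE = re.compile(r"\[(\d+)-(\d+)\]")

def expand_host_range(pattern):
    """Expand the first bracketed range in pattern into a list of host names."""
    match = RANGE.search(pattern)
    if match is None:
        return [pattern]  # no range present, so the pattern names one host
    start, end = match.group(1), match.group(2)
    # Assumption: a leading zero in the start value means fixed-width numbering.
    width = len(start) if start.startswith("0") else 0
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [
        prefix + str(number).zfill(width) + suffix
        for number in range(int(start), int(end) + 1)
    ]

for example in ("10.1.1.[1-4]", "host[1-3].company.com", "host[07-10].company.com"):
    print(example, "->", ", ".join(expand_host_range(example)))
```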
Upgrading from a Cloudera Enterprise Data Hub Edition Trial to Cloudera Enterprise
1. Purchase a Cloudera Enterprise license from Cloudera.
2. On the License page, click Upload License.
3. Click the document icon to the left of the Select a License File text field.
4. Navigate to the location of your license file, click the file, and click Open.
5. Click Upload.
Upgrading from Cloudera Express to Cloudera Enterprise
1. Purchase a Cloudera Enterprise license from Cloudera.
2. On the License page, click Upload License.
3. Click the document icon to the left of the Select a License File text field.
4. Navigate to the location of your license file, click the file, and click Open.
5. Click Upload.
6. Cloudera Manager displays a pop-up describing the features enabled with Cloudera Enterprise Data Hub
Edition. Click OK to proceed. At this point, your installation is upgraded and the Customize Role Assignments
page displays.
7. Under Reports Manager click Select a host. The pageable host selection dialog displays.
The following shortcuts for specifying host names are supported:
• Range of hostnames (without the domain portion)
Range Definition          Matching Hosts
10.1.1.[1-4]              10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4
host[1-3].company.com     host1.company.com, host2.company.com, host3.company.com
host[07-10].company.com   host07.company.com, host08.company.com, host09.company.com, host10.company.com
• IP addresses
• Rack name
8. When you are satisfied with the assignments, click Continue. The Database Setup page displays.
9. Configure settings for required databases:
a. Choose the database type:
• Leave the default setting of Use Embedded Database to have Cloudera Manager create and configure
all required databases. Make a note of the auto-generated passwords.
• Select Use Custom Databases to specify external databases. Enter the database host, database type,
database name, username, and password for the databases that you created when you set up databases
for Cloudera Manager.
1. Provide information for the Activity Monitor (only needed when using MapReduce), Reports Manager,
Hive Metastore, and Cloudera Navigator databases. The value you enter as the database
hostname must match the value you entered for the hostname (if any) when you created the
database.
b. Click Test Connection to confirm that Cloudera Manager can communicate with the databases using the
information you have supplied. If the test succeeds in all cases, click Continue; otherwise check and correct
the information you have provided for the databases and then try the test again. (For Hive, if you are
using the embedded database, you will see a message saying the database will be created at a later point
in the installation process.) The Review Changes page displays.
10. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file
paths required vary based on the services to be installed. For example, you might confirm the NameNode
Data Directory and the DataNode Data Directory for HDFS.
Warning: DataNode data directories should not be placed on NAS devices.
Click Continue. The wizard starts the services.
11. At this point, your installation is upgraded. Click Continue.
12. Restart Cloudera Management Services and audited services to pick up configuration changes. The audited
services will write audit events to a log file, but the events are not transferred to the Cloudera Navigator
Audit Server until you add and start the Cloudera Navigator Audit Server role as described in Adding and
Starting Cloudera Navigator Roles on page 20. For information on Cloudera Navigator, see Cloudera Navigator
documentation.
If you want to use the Cloudera Navigator Metadata Server, add its role following the instructions in Adding and
Starting Cloudera Navigator Roles on page 20.
Renewing a License
1. Download the license file and save it locally.
2. In Cloudera Manager, go to the Home page.
3. Select Administration > License.
4. Click Upload License.
5. Browse to the license file you downloaded.
6. Click Upload.
You do not need to restart Cloudera Manager for the new license to take effect.
Managing Alerts
The Administration > Alerts page provides a summary of the settings for alerts in your clusters.
Alert Type: The left column lets you select by alert type (Health, Log, or Activity) and within that by service instance.
In the case of Health alerts, you can look at alerts for Hosts as well. You can select an individual service to see
just the alert settings for that service.
Health/Log/Activity Alert Settings: Depending on your selection in the left column, the right-hand column shows
you the list of alerts that are enabled or disabled for the selected service type.
To change the alert settings for a service, click the icon next to the service name. This takes you to the Monitoring
section of the Configuration tab for the service. From here you can enable or disable alerts and configure
thresholds as needed.
Recipients: You can also view the list of recipients configured for the enabled alerts.
Configuring Alert Delivery
When you install Cloudera Manager you can configure the mail server you will use with the Alert Publisher.
However, if you need to change these settings, you can do so under the Alert Publisher section of the Management
Services configuration tab. Under the Alert Publisher role of the Cloudera Manager Management Service, you
can configure email or SNMP delivery of alert notifications.
Configuring Alert Email Delivery
Sending A Test Alert E-mail
Select the Administration > Alerts tab and click the Send Test Alert link.
Configuring the List Of Alert Recipient Email Addresses
1. Do one of the following:
• Select the Administration > Alerts tab and click the icon to the right of Recipient(s).
• Go to the Alert Publisher configuration:
a. Do one of the following:
– Select Clusters > Cloudera Management Service > mgmt.
– On the Status tab of the Home page, in the Cloudera Management Service table, click the mgmt link.
b. Select Configuration > View and Edit.
c. Select the Alert Publisher Default Group role group.
2. Configure the Alerts: Mail Message Recipients property.
3. Click the Save Changes button at the top of the page to save your settings.
4. Restart the Alert Publisher role.
Configuring Alert Email Properties
1. Display the Cloudera Management Service status page.
2. Select Configuration > View and Edit.
3. Select the Alert Publisher Default Group role group to see the list of properties. In order to receive email
alerts you must set (or verify) the following settings:
• Enable email alerts.
• Email protocol to use.
• Your mail server hostname and port.
• The username and password of the email user that will be logged into the mail server as the "sender" of
the alert emails.
• A comma-separated list of email addresses that will be the recipients of alert emails.
• The format of the email alert message. Select json if you need the message to be parsed by a script or
program.
4. Click the Save Changes button at the top of the page to save your settings.
5. Restart the Alert Publisher role.
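If you select the json message format so that a script can process alert emails, the script might begin along the lines of the following Python sketch. This guide does not describe the JSON schema, so the sketch assumes nothing about it: it only loads the message body and lists the top-level field names it finds. The sample body and its field names are hypothetical placeholders, not real alert output.

```python
import json

def top_level_fields(message_body):
    """Return the sorted top-level field names of each JSON object in the body."""
    payload = json.loads(message_body)
    # Accept either a single JSON object or a list of objects.
    items = payload if isinstance(payload, list) else [payload]
    return [sorted(item.keys()) for item in items if isinstance(item, dict)]

# Hypothetical message body; real alert emails will carry different fields.
sample_body = '[{"exampleField": "exampleValue", "anotherField": 1}]'
print(top_level_fields(sample_body))
```

Inspecting a real alert email this way is a reasonable first step before writing code that depends on particular fields.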
Configuring Alert SNMP Delivery
Important: This feature is available only with a Cloudera Enterprise license.
For other licenses, the following applies:
• Cloudera Express - the feature is not available.
• Cloudera Enterprise Data Hub Edition Trial - the feature will not be available after you end the
trial or the trial license expires.
To obtain a license for Cloudera Enterprise, please fill in this form or call 866-843-7207. After you
install a Cloudera Enterprise license, the feature will be available.
Enabling, Configuring, and Disabling SNMP Traps
1. Before you enable SNMP traps, configure the trap receiver (Network Management System or SNMP server)
with the Cloudera MIB.
2. Do one of the following:
• Select Clusters > Cloudera Management Service > mgmt.
• On the Status tab of the Home page, in Cloudera Management Service table, click the mgmt link.
3. Select Configuration > View and Edit.
4. Select Alert Publisher Default Group > SNMP.
• Enter the DNS name or IP address of the Network Management System (SNMP server) acting as the trap
receiver in the SNMP NMS Hostname property.
• In the SNMP Security Level property, select the version of SNMP you are using: SNMPv2, SNMPv3 without
authentication and without privacy (noAuthNoPriv), or SNMPv3 with authentication and without privacy
(authNoPriv) and specify the required properties:
– SNMPv2 - SNMPv2 Community String.
– SNMPv3 without authentication (noAuthNoPriv) - SNMP Server Engine Id and SNMP Security
UserName.
– SNMPv3 with authentication (authNoPriv) - SNMP Server Engine Id, SNMP Security UserName, SNMP
Authentication Protocol, and SNMP Authentication Protocol Pass Phrase.
• You can also change other settings such as the port, retry, or timeout values.
5. Click Save Changes when you are done.
6. Restart the Alert Publisher role.
To disable SNMP traps, remove the hostname from the SNMP NMS Hostname property
(alert.snmp.server.hostname).
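The relationship between the SNMP Security Level and the properties it requires can be summarized in a small Python sketch. The property names are taken from the steps above; the dictionary keys (SNMPv2, noAuthNoPriv, authNoPriv) and the function name are shorthand chosen for this example and may not match the values shown in the Admin Console.

```python
# Required properties per security level, as listed in the procedure above.
# The dictionary keys are illustrative shorthand, not Admin Console values.
REQUIRED_BY_LEVEL = {
    "SNMPv2": ["SNMPv2 Community String"],
    "noAuthNoPriv": ["SNMP Server Engine Id", "SNMP Security UserName"],
    "authNoPriv": [
        "SNMP Server Engine Id",
        "SNMP Security UserName",
        "SNMP Authentication Protocol",
        "SNMP Authentication Protocol Pass Phrase",
    ],
}

def missing_snmp_properties(security_level, settings):
    """Return the required property names that are absent or empty in settings."""
    required = ["SNMP NMS Hostname"] + REQUIRED_BY_LEVEL[security_level]
    return [name for name in required if not settings.get(name)]

# Example: an SNMPv3 noAuthNoPriv setup that still lacks a user name.
settings = {
    "SNMP NMS Hostname": "nms.example.com",
    "SNMP Server Engine Id": "engine-id-placeholder",
}
print(missing_snmp_properties("noAuthNoPriv", settings))
```

An empty SNMP NMS Hostname is always reported as missing, which matches the note above that clearing the hostname disables SNMP traps.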
Viewing the Cloudera MIB
1. Do one of the following:
• Select Clusters > Cloudera Management Service > mgmt.
• On the Status tab of the Home page, in Cloudera Management Service table, click the mgmt link.
2. Select Configuration > View and Edit.
3. Select Alert Publisher Default Group > SNMP.
4. In the Description column for the first property (SNMP NMS Hostname) click the SNMP MIB link.
Kerberos
After enabling and configuring Hadoop security using Kerberos on your cluster, you can view and regenerate
the Kerberos principals for your cluster. If you make a global configuration change in your cluster, such as
changing the encryption type, you would use the Kerberos page to regenerate the principals for your cluster. In
a secure cluster, the Kerberos page lists all the Kerberos principals that are active on your cluster.
Regenerating Kerberos Principals
If you make a global configuration change in your cluster, such as changing the encryption type, you must use
the following instructions to regenerate the principals for your cluster.
Important:
• Regenerate principals using the following steps in the Cloudera Manager Admin Console and not
directly using kadmin shell.
• Do not regenerate the principals for your cluster unless you have made a global configuration
change. Before regenerating, be sure to read Step 2: Set up a Cluster-dedicated KDC and Default
Domain for the Hadoop Cluster to avoid making your existing host keytabs invalid.
To view and regenerate the Kerberos principals for your cluster:
1. Select Administration > Kerberos.
2. The currently configured Kerberos principals are displayed. If you are running HDFS, the hdfs/hostname and
host/hostname principals are listed. If you are running MapReduce, the mapred/hostname and
host/hostname principals are listed. The principals for other running services are also listed.
3. Only if necessary, select the principals you want to regenerate.
4. Click Regenerate.
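To cross-check the list on the Kerberos page, you could compute the principals you expect to see. The Python sketch below covers only the two services named above (HDFS and MapReduce) and omits the realm suffix. The function name, the input mapping, and the example hostname are assumptions for illustration.

```python
# Service-to-principal prefixes stated in the procedure above.
SERVICE_PREFIXES = {"HDFS": "hdfs", "MapReduce": "mapred"}

def expected_principals(services_by_host):
    """Return sorted principal names (without realm) for the known services."""
    principals = set()
    for hostname, services in services_by_host.items():
        for service in services:
            prefix = SERVICE_PREFIXES.get(service)
            if prefix is None:
                continue  # other services also have principals; they are not modeled here
            principals.add(f"{prefix}/{hostname}")
            principals.add(f"host/{hostname}")
    return sorted(principals)

print(expected_principals({"node1.example.com": ["HDFS", "MapReduce"]}))
```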
The Security Inspector
The Security Inspector uses the Host Inspector to run a security-related set of commands on the hosts in your
cluster. It reports on things such as how Java is configured for encryption and on the default realms configured
on each host:
1. Select Administration > Kerberos.
2. Click Security Inspector. Cloudera Manager begins several tasks to inspect the managed hosts.
3. After the inspection completes, click Download Result Data or Show Inspector Results to review the results.
Sending Usage and Diagnostic Data to Cloudera
Cloudera Manager collects anonymous usage information and takes regularly-scheduled snapshots of the state
of your cluster and automatically sends them anonymously to Cloudera. This helps Cloudera improve and
optimize Cloudera Manager.
If you have a Cloudera Enterprise license, you can also trigger the collection of diagnostic data and send it to
Cloudera Support to aid in resolving a problem you may be having.
Managing Anonymous Usage Data Collection
Cloudera Manager sends anonymous usage information using Google Analytics to Cloudera. The information
helps Cloudera improve Cloudera Manager. By default anonymous usage data collection is enabled.
1. From the Administration tab, select Settings.
2. Under the Other category, set the Allow Usage Data Collection property.
3. Click Save Changes.
Managing Hue Analytics Data Collection
Hue tracks anonymised pages and application versions in order to gather information to help compare each
application's usage levels. The data collected does not include any hostnames or IDs. For example, the data is
of the form: /2.3.0/pig, /2.5.0/beeswax/execute. You can restrict data collection as follows:
1. Go to the Hue service.
2. Select Configuration > View and Edit.
3. Expand the Service-Wide category.
4. Uncheck the Enable Usage Data Collection checkbox.
5. Click Save Changes.
6. Restart the Hue service.
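To make the shape of the tracked values concrete, the following Python sketch splits the two example paths given above into a version, an application name, and an optional action. The interpretation of the path segments and the function name are assumptions based only on those two examples.

```python
def parse_tracked_path(path):
    """Split a tracked path such as /2.5.0/beeswax/execute into its parts."""
    parts = path.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"expected /<version>/<application>[/<action>], got {path!r}")
    version, application = parts[0], parts[1]
    # Anything after the application name is treated as the action, if present.
    action = "/".join(parts[2:]) or None
    return {"version": version, "application": application, "action": action}

for example in ("/2.3.0/pig", "/2.5.0/beeswax/execute"):
    print(parse_tracked_path(example))
```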
Diagnostic Data Collection
To help with solving problems when using Cloudera Manager on your cluster, Cloudera Manager collects diagnostic
data on a regular schedule, and automatically sends it to Cloudera. By default Cloudera Manager is configured
to collect data weekly and to send it automatically. You can schedule the frequency of data collection on a daily,
weekly, or monthly schedule, or disable the scheduled collection of data entirely. You can also send a collected
data set manually.
Note:
• Automatically sending diagnostic data requires the Cloudera Manager Server host to have Internet
access, and be configured for sending data automatically. If your Cloudera Manager server does
not have Internet access, and you have a Cloudera Enterprise license, you can manually send the
diagnostic data as described in Manually Triggering Collection and Transfer of Diagnostic Data to
Cloudera on page 74.
• Automatically sending diagnostic data may fail sometimes and return an error message of "Could
not send data to Cloudera." To work around this issue, you can manually send the data to Cloudera
Support.
What Data Does Cloudera Manager Collect?
Cloudera Manager collects and returns a significant amount of information about the health and performance
of the cluster. It includes:
• Up to 1000 Cloudera Manager audit events: Configuration changes, add/remove of users, roles, services, etc.
• One day's worth of Cloudera Manager events: This includes critical errors that Cloudera Manager watches for,
and more.
• Data about the cluster structure which includes a list of all hosts, roles, and services along with the
configurations that are set through Cloudera Manager. Where passwords are set in Cloudera Manager, the
passwords are not returned.
• Cloudera Manager license and version number.
• Current health information for hosts, services, and roles. Includes results of health tests run by Cloudera
Manager.
• Heartbeat information from each host, service, and role. These include status and some information about
memory, disk, and processor usage.
• The results of running Host Inspector.
• One day's worth of Cloudera Manager metrics.
Note: If you are using Cloudera Express, host metrics are not included.
• A download of the debug pages for Cloudera Manager roles.
• For each host in the cluster, the result of running a number of system-level commands on that host.
• Logs from each role on the cluster, as well as the Cloudera Manager server and agent logs.
• Which parcels are activated for which clusters.
• Whether there's an active trial, and if so, metadata about the trial.
• Metadata about the Cloudera Manager server, such as its JMX metrics, stack traces, and the database/host
it's running with.
• HDFS/Hive replication schedules (including command history) for the deployment.
• Impala query logs.
Configuring the Frequency of Diagnostic Data Collection
By default, Cloudera Manager collects diagnostic data on a weekly basis. You can change the frequency to daily,
weekly, monthly, or never. If you are a Cloudera Enterprise customer and you set the schedule to never you can
still collect and send data to Cloudera on demand. If you are a Cloudera Express customer and you set the
schedule to never, data is not collected or sent to Cloudera.
1. Select Administration > Settings.
2. Under the Support category, click Scheduled Diagnostic Data Collection Frequency and select the frequency.
3. To set the day and time of day that the collection will be performed, click Scheduled Diagnostic Data Collection
Time and specify the date and time in the pop-up control.
4. Click Save Changes.
You can see the current setting of the data collection frequency by viewing Support > Scheduled Diagnostics: in
the main navigation bar.
Specifying the Diagnostic Data Directory
You can configure the directory where collected data is stored.
1. Select Administration > Settings.
2. Under the Support category, set the Diagnostic Data Bundle Directory to a directory on the host running
Cloudera Manager Server. The directory must exist and be enabled for writing by the user cloudera-scm. If
this field is left blank, the data is stored in /tmp.
3. Click Save Changes.
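Before saving the setting, you may want to confirm that the directory meets the requirements in step 2. The Python sketch below applies the /tmp fallback for a blank value and checks that the directory exists and is writable. It tests access for the user running the script, so it assumes you run it as cloudera-scm on the Cloudera Manager Server host. The function names are illustrative.

```python
import os

DEFAULT_BUNDLE_DIR = "/tmp"  # used when the property is left blank, per step 2

def effective_bundle_dir(configured):
    """Return the directory that would be used for the given property value."""
    value = (configured or "").strip()
    return value or DEFAULT_BUNDLE_DIR

def directory_problems(path):
    """Return a list of problems that would prevent writing bundles to path."""
    problems = []
    if not os.path.isdir(path):
        problems.append("directory does not exist")
    elif not os.access(path, os.W_OK | os.X_OK):
        # os.access checks the user running this script (assumed: cloudera-scm).
        problems.append("directory is not writable by the current user")
    return problems

target = effective_bundle_dir("")
print(target, directory_problems(target))
```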
Collecting and Sending Diagnostic Data to Cloudera
Important: This feature is available only with a Cloudera Enterprise license.
For other licenses, the following applies:
• Cloudera Express - the feature is not available.
• Cloudera Enterprise Data Hub Edition Trial - the feature will not be available after you end the
trial or the trial license expires.
To obtain a license for Cloudera Enterprise, please fill in this form or call 866-843-7207. After you
install a Cloudera Enterprise license, the feature will be available.
Disabling the Automatic Sending of Diagnostic Data from a Manually Triggered Collection
If you do not want data automatically sent to Cloudera after manually triggering data collection, you can disable
this feature. The data you collect will be saved and can be downloaded for sending to Cloudera Support at a later
time.
1. Select Administration > Settings.
2. Under the Support category, uncheck the box for Send Diagnostic Data to Cloudera Automatically.
3. Click Save Changes.
Note: The Send Diagnostic Data form that displays when you collect data in one of the following
procedures indicates whether the data will be sent automatically.
Manually Triggering Collection and Transfer of Diagnostic Data to Cloudera
1. Optionally change the System Identifier property:
a. Select Administration > Settings.
b. Click the Other category.
c. Set the System Identifier property and click Save Changes.
2. Click the Support menu link.
3. Choose Send Diagnostic Data. The Send Diagnostic Data form displays.
4. Fill in or change the information here as appropriate:
• Cloudera Manager populates the End Time based on the setting of the Time Range selector. You should
change this to be a few minutes after you observed the problem or condition that you are trying to capture.
The time range is based on the timezone of the host where Cloudera Manager Server is running.
• If you have a support ticket open with Cloudera Support, include the support ticket number in the field
provided.
5. Depending on whether you have disabled automatic sending of data, do one of the following:
• Click Collect and Send Diagnostic Data. A Running Commands window shows you the progress of the
data collection steps. When these steps are complete, the collected data is sent to Cloudera.
• Click Collect Diagnostic Data. A Command Details window shows you the progress of the data collection
steps.
1. In the Command Details window, click Download Result Data to download and save a zip file of the
information.
2. Send the data to Cloudera Support by doing one of the following:
• 1. Download the phone_home script.
2. Copy the script and the downloaded data file to a host that has Internet access.
3. Run the following command on that host:
python phone_home.py --file <downloaded data file>
• Contact Cloudera Support and arrange to send the data file.
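When filling in the End Time on the Send Diagnostic Data form, remember that the value is interpreted in the timezone of the Cloudera Manager Server host. The Python sketch below takes the moment you observed a problem (in UTC), adds a few minutes, and converts the result to the server's UTC offset. The five-minute margin, the fixed-offset timezone handling, and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

def suggested_end_time(observed_utc, server_utc_offset_hours, margin_minutes=5):
    """Return an End Time a few minutes after observed_utc, in the server's offset."""
    # A fixed offset is used for simplicity; it ignores daylight saving rules.
    server_tz = timezone(timedelta(hours=server_utc_offset_hours))
    return (observed_utc + timedelta(minutes=margin_minutes)).astimezone(server_tz)

# Example: problem seen at 12:00 UTC, server host running at UTC-8.
observed = datetime(2015, 3, 1, 12, 0, tzinfo=timezone.utc)
print(suggested_end_time(observed, -8).isoformat())
```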
Importing Cloudera Manager Settings
Backing up your Current Deployment
To back up your current deployment, follow the instructions in Back up Databases on page 42. The import feature
should not be relied on for backup and recovery at this time.
Building a Cloudera Manager Deployment
You can use the Cloudera Manager API to programmatically build a Cloudera Manager Deployment, which is a
definition of all the entities in your Cloudera Manager-managed deployment: clusters, services, roles, hosts,
users, and so on. See the Cloudera Manager API documentation on how to manage deployments using the
/cm/deployment resource.
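As a starting point for working with the /cm/deployment resource, the following Python sketch builds the resource URL and retrieves the deployment as JSON. Only the /cm/deployment path comes from this guide. The port 7180, the /api/<version> prefix, the API version string, and the use of HTTP basic authentication are assumptions that you should verify against the Cloudera Manager API documentation for your release.

```python
import base64
import json
import urllib.request

def deployment_url(host, api_version, port=7180):
    """Build the URL of the /cm/deployment resource (port and prefix are assumed)."""
    return f"http://{host}:{port}/api/{api_version}/cm/deployment"

def fetch_deployment(host, api_version, user, password, port=7180):
    """GET the deployment description and return it as parsed JSON."""
    request = urllib.request.Request(deployment_url(host, api_version, port))
    # Assumption: the API accepts HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# Hypothetical host name and API version, shown only to illustrate the URL shape.
print(deployment_url("cm-server.example.com", "v1"))
```

Calling fetch_deployment requires a reachable Cloudera Manager Server and valid credentials, so only the URL helper is exercised in the example.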
Uploading a Cloudera Manager 4.x Configuration Script
Importing a deployment should be done using the Cloudera Manager API. See the Cloudera Manager API
documentation for details.