CDH Operation Guide: Key Tools of the Big Data Platform Explained
Updated 2024-07-18
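"The CDH operation manual is an important guide written for users of Cloudera's distributed computing platform (CDH), published and copyrighted by Cloudera between 2010 and 2017. It describes in detail how to manage and operate the Cloudera big data platform effectively, including the core components of the Hadoop stack such as the Hadoop Distributed File System (HDFS), MapReduce, and YARN. Hadoop and the elephant logo, which are trademarks of the Apache Software Foundation, are also mentioned in the document, reflecting its open-source community roots. Strict copyright rules apply when reading and using the manual: without authorization, its content, including but not limited to text, figures, and trademarks, may not be copied, imitated, or used in any form. Other products, services, processes, or information mentioned, whether or not they are identified by trademark, manufacturer name, or supplier, do not imply a recommendation or endorsement and are cited only as technical background. The manual stresses that users must comply with all applicable copyright laws, and it prohibits reproducing the document in any form, storing it in a retrieval system, or transmitting it by electronic or mechanical means such as photocopying. The manual is intended to help CDH users understand the system's architecture, configuration, administration, and maintenance, and how to optimize performance and security, which makes it an essential reference for professionals working or doing research in big data. It may include detailed installation guides, configuration steps, troubleshooting tips, best practices, and case studies so that readers can carry out data processing and analysis tasks efficiently in a CDH environment."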
Configuring Host Monitoring
1. Click the Hosts tab.
2. Select a host.
3. Click the Configuration tab.
4. Select Scope > All.
5. Click the Monitoring category.
6. Configure the property.
7. Click Save Changes to commit the changes.
8. Return to the Home page by clicking the Cloudera Manager logo.
9. Click the icon that is next to any stale services to invoke the cluster restart wizard.
Configuring Directory Monitoring
Cloudera Manager can perform threshold-based monitoring of free space in the various directories on the hosts it
monitors—such as log directories or checkpoint directories (for the Secondary NameNode).
These thresholds can be set in one of two ways—as absolute thresholds (in terms of MiB and GiB, and so on) or as
percentages of space. As with other threshold properties, you can set values that trigger events at both the Warning
and Critical levels.
If you set both thresholds, the Absolute Threshold setting is used.
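The precedence rule above can be sketched in Python. This is only an illustration of the described behavior, not Cloudera Manager's implementation: the function name `free_space_health`, the (warning, critical) tuple layout, and the result labels are assumptions made for the example.

```python
from typing import Optional, Tuple

# (warning, critical) pair; hypothetical layout chosen for this sketch.
Thresholds = Tuple[float, float]

GiB = 1024 ** 3


def free_space_health(free_bytes: int, total_bytes: int,
                      absolute: Optional[Thresholds] = None,
                      percentage: Optional[Thresholds] = None) -> str:
    """Classify free space as 'good', 'warning', or 'critical'.

    absolute   -- (warning, critical) free-space limits in bytes
    percentage -- (warning, critical) free-space limits in percent of capacity
    If both are given, only the absolute thresholds are consulted.
    """
    if absolute is not None:
        value, (warning, critical) = free_bytes, absolute
    elif percentage is not None:
        value, (warning, critical) = 100.0 * free_bytes / total_bytes, percentage
    else:
        return "good"  # no thresholds configured, nothing to check
    if value < critical:
        return "critical"
    if value < warning:
        return "warning"
    return "good"


# 8 GiB free out of 100 GiB is 8% free space.
print(free_space_health(8 * GiB, 100 * GiB, percentage=(20.0, 10.0)))  # critical
# With both kinds set, the absolute thresholds (10 GiB / 5 GiB) win.
print(free_space_health(8 * GiB, 100 * GiB,
                        absolute=(10 * GiB, 5 * GiB),
                        percentage=(20.0, 10.0)))  # warning
```

The second call shows why the precedence matters: the same directory is critical by percentage but only at the warning level by the absolute limits.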
Configuring Activity Monitoring
The Activity Monitor monitors the MapReduce MRv1 jobs running on your cluster. This also includes the higher-level
activities, such as Pig, Hive, and Oozie workflows that run as MapReduce tasks.
You can monitor for slow-running jobs or jobs that fail, and alert on these events. To detect jobs that are running too
slowly, you must configure a set of activity duration rules that specify what jobs to monitor, and what the limits on
duration are for those jobs. A "slow activity" event occurs when a job exceeds the duration limit configured for it in
an activity duration rule. Activity duration rules are not defined by default; you must configure these rules if you want
to see events for jobs that exceed the duration defined by these rules.
To configure Activity Monitor settings:
1. Go to the MapReduce service.
2. Click the Configuration tab.
3. Select Scope > MapReduce service_name (Service-Wide).
4. Click the Monitoring category.
5. Specify one or more activity duration rules.
6. Click Save Changes to commit the changes.
7. Return to the Home page by clicking the Cloudera Manager logo.
8. Click the icon that is next to any stale services to invoke the cluster restart wizard.
Activity Duration Rules
An activity duration rule is a regular expression that matches an activity name (that is, a Job ID), combined with a
run time limit that the job should not exceed. You can add as many rules as you like, one per line, in the Activity
Duration Rules property.
The format of each rule is regex=number, where regex is a regular expression to match against the activity name
and number is the job duration limit, in minutes. When a new activity starts, each regular expression is tested against
the name of the activity for a match.
The list of rules is tested in order, and the first match found is used. For example, if the rule set is:
foo=10
bar=20
16 | Cloudera Operation
Monitoring and Diagnostics
any activity named "foo" would be marked slow if it ran for more than 10 minutes. Any activity named "bar" would be
marked slow if it ran for more than 20 minutes.
Since Java regular expressions can be used, if the rule set is:
foo.*=10
bar=20
any activity with a name that starts with "foo" (for example, fool, food, foot) matches the first rule.
If there is no match for an activity, then that activity is not monitored for job duration. However, you can add a "catch-all"
as the last rule that always matches any name:
foo.*=10
bar=20
baz=30
.*=60
In this case, any job that matches none of the earlier rules and runs longer than 60 minutes is marked slow and generates an event.
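The first-match evaluation described above can be sketched in Python. This is a hedged illustration rather than the Activity Monitor's actual code: the helper names (`parse_rules`, `duration_limit`, `is_slow`) are hypothetical, Python's `re` module stands in for Java regular expressions, and the sketch assumes a rule must match the entire activity name, which is consistent with the foo and foo.* examples but is not stated explicitly.

```python
import re
from typing import List, Optional, Tuple

Rule = Tuple["re.Pattern", int]


def parse_rules(text: str) -> List[Rule]:
    """Parse one regex=number rule per line, preserving rule order."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last '=' so the regex part may itself contain '='.
        regex, _, minutes = line.rpartition("=")
        rules.append((re.compile(regex), int(minutes)))
    return rules


def duration_limit(rules: List[Rule], activity_name: str) -> Optional[int]:
    """Return the limit (minutes) of the first matching rule, or None."""
    for pattern, minutes in rules:
        # Assumption: the regex has to match the whole activity name.
        if pattern.fullmatch(activity_name):
            return minutes
    return None  # no rule matched: the activity is not monitored


def is_slow(rules: List[Rule], activity_name: str, runtime_minutes: float) -> bool:
    """True if the activity ran longer than the limit of its matching rule."""
    limit = duration_limit(rules, activity_name)
    return limit is not None and runtime_minutes > limit


rules = parse_rules("foo.*=10\nbar=20\nbaz=30\n.*=60")
print(duration_limit(rules, "food"))   # 10, first rule matches
print(duration_limit(rules, "bar"))    # 20
print(duration_limit(rules, "other"))  # 60, only the catch-all matches
print(is_slow(rules, "food", 15))      # True
```

Because the rules are tested in order, the catch-all must come last; placed first, it would shadow every more specific rule.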
Configuring YARN Application Monitoring
You can configure the visibility of the YARN application monitoring results.
Configuring Application Visibility
To configure whether admin and non-admin users can view all applications, only that user's applications, or no
applications:
1. Go to the YARN service.
2. Click the Configuration tab.
3. Select Scope > YARN service_name (Service-Wide).
4. Click the Monitoring category.
5. Set the Applications List Visibility Settings properties for admin and non-admin users.
6. Click Save Changes to commit the changes.
7. Return to the Home page by clicking the Cloudera Manager logo.
8. Click the icon that is next to any stale services to invoke the cluster restart wizard.
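The three visibility choices can be pictured as a filter applied to the application list. The Python sketch below is illustrative only: the setting names `ALL`, `OWN`, and `NONE`, the `Application` class, and the function `visible_applications` are hypothetical and are not taken from Cloudera Manager.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Application:
    app_id: str
    owner: str


def visible_applications(apps: List[Application], viewer: str, is_admin: bool,
                         admin_setting: str, non_admin_setting: str) -> List[Application]:
    """Return the applications the viewer may see under the given settings."""
    # Admin and non-admin users each have their own visibility setting.
    setting = admin_setting if is_admin else non_admin_setting
    if setting == "ALL":
        return list(apps)
    if setting == "OWN":
        return [app for app in apps if app.owner == viewer]
    if setting == "NONE":
        return []
    raise ValueError(f"unknown visibility setting: {setting}")


apps = [Application("app_1", "alice"), Application("app_2", "bob")]
seen = visible_applications(apps, "alice", is_admin=False,
                            admin_setting="ALL", non_admin_setting="OWN")
print([app.app_id for app in seen])  # ['app_1']
```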
Configuring Impala Query Monitoring
You can configure the visibility of the Impala query results and the size of the storage allocated to Impala query results.
Configuring Query Visibility
To configure whether admin and non-admin users can view all queries, only that user's queries, or no queries:
1. Go to the Impala service.
2. Click the Configuration tab.
3. Select Scope > Impala service_name (Service-Wide).
4. Click the Monitoring category.
5. Set the Visibility Settings properties for admin and non-admin users.
6. Click Save Changes to commit the changes.
7. Return to the Home page by clicking the Cloudera Manager logo.
8. Click the icon that is next to any stale services to invoke the cluster restart wizard.
Configuring Impala Query Data Store Maximum Size
The query store stores enough information to make the query searchable through the filter language.
1. Do one of the following:
• Select Clusters > Cloudera Management Service.
• On the Home > Status tab, in the Cloudera Management Service table, click the Cloudera Management Service
link.
2. Click the Configuration tab.
3. Select Scope > Service Monitor.
4. Click the Main category.
5. In the Impala Storage section, set the firehose_impala_storage_bytes property. The default is 1 GiB.
6. Click Save Changes to commit the changes.
7. Return to the Home page by clicking the Cloudera Manager logo.
8. Click the icon that is next to any stale services to invoke the cluster restart wizard.
The firehose_impala_storage_bytes property determines the approximate amount of disk space dedicated to
storing Impala query data. Once the store reaches its maximum size, older data is deleted to make room for newer
queries. The disk usage is approximate because data deletion begins only when the limit has been reached.
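The reason the size is only approximate can be seen in a small Python sketch of a size-bounded store. This is a toy model under stated assumptions, not the Service Monitor's storage engine: the `QueryStore` class, its fields, and the per-record sizes are all hypothetical.

```python
from collections import deque
from typing import List


class QueryStore:
    """Toy model of a store that evicts its oldest records at a size limit."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self.peak_bytes = 0
        self._records = deque()  # oldest record on the left

    def add(self, query_id: str, size_bytes: int) -> None:
        self._records.append((query_id, size_bytes))
        self.used_bytes += size_bytes
        self.peak_bytes = max(self.peak_bytes, self.used_bytes)
        # Deletion starts only once the limit has been reached, so usage can
        # briefly exceed max_bytes before the oldest records are dropped.
        while self.used_bytes >= self.max_bytes and len(self._records) > 1:
            _, old_size = self._records.popleft()
            self.used_bytes -= old_size

    def query_ids(self) -> List[str]:
        return [query_id for query_id, _ in self._records]


store = QueryStore(max_bytes=100)
for i in range(5):
    store.add(f"q{i}", 40)
print(store.query_ids())                   # ['q3', 'q4']
print(store.used_bytes, store.peak_bytes)  # 80 120
```

The peak of 120 bytes against a limit of 100 mirrors the behavior described above: the limit triggers cleanup rather than acting as a hard cap.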
Configuring Alerts
Enabling Activity Monitor Alerts
You can enable alerts when an activity runs too slowly or fails.
1. Go to the MapReduce service.
2. Click the Configuration tab.
3. Select Scope > MapReduce service_name (Service-Wide).
4. Click the Monitoring category.
5. Check the Alert on Slow Activities or Alert on Activity Failure checkboxes.
6. Click Save Changes to commit the changes.
7. Return to the Home page by clicking the Cloudera Manager logo.
8. Click the icon that is next to any stale services to invoke the cluster restart wizard.
Enabling Configuration Change Alerts
Configuration change alerts can be set service wide, or on specific roles for the service.
1. Click a service, role, or host.
2. Click the Configuration tab.
3. Select Scope > All.
4. Click the Monitoring category.
5. Check the Enable Configuration Change Alerts checkbox.
6. Click Save Changes to commit the changes.
7. Return to the Home page by clicking the Cloudera Manager logo.
8. Click the icon that is next to any stale services to invoke the cluster restart wizard.
Enabling HBase Alerts
1. Go to the HBase service.
2. Click the Configuration tab.
3. Select Scope > HBase service_name (Service-Wide).
4. Click the Monitoring category.
5. Set one of the region or Hbck alerts:
• Hbck Region Error Count
• Hbck Error Count
• Hbck Alert Error Codes
• Hbck Slow Run
• Region Health Canary Slow Run
• Canary Unhealthy Region Count
• Canary Unhealthy Region Percentage
6. Click Save Changes to commit the changes.
7. Return to the Home page by clicking the Cloudera Manager logo.
8. Click the icon that is next to any stale services to invoke the cluster restart wizard.
Configuring Health Alerts
Enabling Health Alerts
You can enable alerts when the health of a role or service crosses a threshold.
1. Select Clusters > cluster_name > service_name or open the page for a role.
2. Click the Configuration tab.
3. Select Scope > role_name or service_name (Service-Wide).
4. Click the Monitoring category.
5. Check the Enable Health Alerts for this Role or Enable Service Level Health Alerts checkbox, depending on whether
you are configuring a role or a service.
6. Click Save Changes to commit the changes.
7. Return to the Home page by clicking the Cloudera Manager logo.
8. Click the icon that is next to any stale services to invoke the cluster restart wizard.
Modifying the Health Threshold
You can configure the health threshold at which an alert is raised.
1. Select Administration > Alerts.
2. Click the icon to the right of Health Alert Threshold.
3. Select Scope > Event Server.
4. Click the Main category.
5. Select the Bad or Concerning option.
6. Click Save Changes to commit the changes.
7. Return to the Home page by clicking the Cloudera Manager logo.
8. Click the icon that is next to any stale services to invoke the cluster restart wizard.
Configuring Alerts Transitioning Out of Alerting Health Threshold
You can configure an alert when a service or role instance transitions from an alerting to a non-alerting health threshold.
1. Select Administration > Alerts.
2. Click the icon to the right of Alert on Transitions out of Alerting Health.
3. Select Scope > role_name or service_name (Service-Wide).
4. In the category Event Server Default Group, check the Alert on Transitions out of Alerting Health checkbox.
5. Click Save Changes to commit the changes.
6. Return to the Home page by clicking the Cloudera Manager logo.
7. Click the icon that is next to any stale services to invoke the cluster restart wizard.
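How the health alert threshold and the transition-out setting interact can be sketched in Python. The sketch is an assumption-laden illustration, not the Event Server's logic: the `Health` enum, the function `health_alert`, the message strings, and the choice to alert only when the health value changes are all hypothetical.

```python
from enum import IntEnum
from typing import Optional


class Health(IntEnum):
    GOOD = 0
    CONCERNING = 1
    BAD = 2


def health_alert(previous: Health, current: Health, threshold: Health,
                 alert_on_transition_out: bool) -> Optional[str]:
    """Return an alert message for a health change, or None if no alert."""
    was_alerting = previous >= threshold
    is_alerting = current >= threshold
    # Assumption: an alert is raised when health changes to an alerting state.
    if is_alerting and current != previous:
        return f"health changed to {current.name}"
    # Optional alert when health leaves the alerting range again.
    if was_alerting and not is_alerting and alert_on_transition_out:
        return f"health recovered to {current.name}"
    return None


# Threshold Bad: a change to Concerning does not alert.
print(health_alert(Health.GOOD, Health.CONCERNING, Health.BAD, False))         # None
# Threshold Concerning: the same change does alert.
print(health_alert(Health.GOOD, Health.CONCERNING, Health.CONCERNING, False))  # health changed to CONCERNING
# Recovery alerts only when the transition-out option is checked.
print(health_alert(Health.BAD, Health.GOOD, Health.BAD, True))                 # health recovered to GOOD
```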
Configuring Log Alerts
You can configure an alert when a daemon emits a log message that matches a specified regular expression. See
Configuring Log Alerts on page 22.
Configuring Alert Delivery
You can configure alerts to be delivered by email or sent as SNMP traps. If you choose email delivery, you can add to
or modify the list of alert recipient email addresses. You can also send a test alert email. See Managing Alerts.
Note: If alerting is enabled for events, you can search for and view alerts in the Events tab, even if
you do not have email notification configured.
Configuring Log Events
You can enable or disable the forwarding of selected log events to the Event Server. This is enabled by default, and is
a service-wide setting (Enable Log Event Capture) for each service for which monitoring is provided. You can enable
and disable event capture for CDH services or for the Cloudera Management Service.
Important: We do not recommend logging to a network-mounted file system. If a role is writing its
logs across the network, a network failure or the failure of a remote file system can cause that role
to freeze up until the network recovers.
Configuring Logs
1. Go to a service.
2. Click the Configuration tab.
3. Select role_name (Service-Wide) > Logs.
4. Edit a log property.
5. Click Save Changes to commit the changes.
6. Return to the Home page by clicking the Cloudera Manager logo.
7. Click the icon that is next to any stale services to invoke the cluster restart wizard.
Configuring Logging Thresholds
A logging threshold determines what level of log message is reported. The available levels are:
• TRACE - Informational events finer-grained than DEBUG.
• DEBUG - Informational events useful to debug an application.
• INFO - Informational events that highlight progress at a coarse-grained level.
• WARN - Events that indicate a potential problem which is handled by the application.
• ERROR - Error events that might still allow the application to continue running.
• FATAL - Very severe error events that typically lead the application to abort.
TRACE produces the most messages and has the lowest severity. The default setting is INFO.
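The effect of a threshold on the levels listed above can be sketched in Python. The list order follows the severity order given in the text; the function names `is_reported` and `reported_levels` are hypothetical helpers for this illustration and are not part of any logging library.

```python
from typing import List

# Levels ordered from least severe (most messages) to most severe.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def is_reported(message_level: str, threshold: str = "INFO") -> bool:
    """A message is reported if its level is at or above the threshold."""
    return LEVELS.index(message_level) >= LEVELS.index(threshold)


def reported_levels(threshold: str = "INFO") -> List[str]:
    """List every level that passes the given threshold."""
    return [level for level in LEVELS if is_reported(level, threshold)]


print(reported_levels())         # ['INFO', 'WARN', 'ERROR', 'FATAL']
print(reported_levels("TRACE"))  # all six levels
print(is_reported("DEBUG"))      # False under the default INFO threshold
```

Lowering the threshold to DEBUG or TRACE therefore increases log volume, which is worth keeping in mind before changing it on a busy role.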
1. Go to a service.
2. Click the Configuration tab.
3. Enter Logging Threshold in the Search text field.
4. For the desired role group, select a logging threshold level.
5. Click Save Changes to commit the changes.
6. Return to the Home page by clicking the Cloudera Manager logo.
7. Click the icon that is next to any stale services to invoke the cluster restart wizard.
Configuring Log Directories
1. Do one of the following:
• Cluster
1. On the Home > Status tab, click a cluster name.
2. Select Configuration > Log Directories.
3. Edit a role_name Log Directory property.
• Service