1. The following SQL uses SubPlan + broadcast. Rewrite it according to its semantics so that it runs more efficiently: `select * from user01.tb1 t1 where exists (select max(id) from user01.tb2 t2 where t1.name = t2.name);`
2. In the following SQL, table t1 is sent with the broadcast operator. Use a hint so that t1 is not broadcast: `select t1.id, t2.id2 from user01.tb1 t1 inner join user01.t_skew t2 on t1.id = t2.id2 and t1.name = 'beijing' order by 1;`
3. How do you determine whether the following statement is pushed down? Describe the method: `select count(t1.*) from user01.tb1 t1 left join user01.tb2 t2 on t1.id = t2.td and t2.name = 'beijing'`
4. In the execution plan of the following statement, the optimizer chooses t1 as the hash inner table and t2 as the hash outer table. Use a hint to adjust the plan so that t2 becomes the hash inner table: `select t1.id, t2.id2 from user01.tb1 t1 inner join user01.t_skew t2 on t1.id = t2.id2 and t1.name = 'beijing' order by 1`
5. Grant the schema privilege to user user1, and grant access to all tables in schema s2 to user user1.
6. Give the gsql command that turns on timing.
7. Create a user named us01 and grant it the sysadmin privilege.
8. Create database usdb with character set UTF-8 and a connection limit of 20.
9. In the execution plan of the following statement, the optimizer chose a nested loop join. Rewrite the statement according to its semantics so that the join becomes a hash join and query performance improves: `select * from user01.tb1 t1 where t1.id not in (select t2.id from user01.tb2 t2 where t2.name = 'shanghai');` Reference rewrite: `select * from user01.tb1 t1 where not exists (select 1 from user01.tb2 t2 where t2.name = 'shanghai' and t1.id = t2.id) and t1.id is not null;`
10. The execution plan below has a performance problem. The cluster has 10 DNs, and `select reltuples from pg_class where relname = 't1';` returns 123456798765. Analyze the plan, identify the performance bottleneck, and suggest an optimization:

| id | operation | E-rows | E-memory | E-width | E-costs |
|----|-----------|--------|----------|---------|---------|
| 1 | -> Streaming (type: GATHER) | 100 | | 12 | 114.24 |
| 2 | -> Hash Join (3,4) | 100 | 1MB | 12 | 111.05 |
| 3 | -> Seq Scan on t2 | 100000 | 1MB | 4 | 91.50 |
| 4 | -> Hash | 96 | 16MB | 8 | 3.56 |
| 5 | -> Streaming (type: BROADCAST) | 100 | 2MB | 8 | 3.56 |
| 6 | -> Seq Scan on t1 | 100 | 1MB | 8 | 3.06 |
Posted: 2024-04-22 20:28:04 | Views: 176
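1. An aggregate without `GROUP BY` always returns exactly one row (`max(id)` is NULL when no row matches), so the `EXISTS` condition is true for every row of tb1. The correlated subquery, and with it the SubPlan and the broadcast, can therefore be dropped without changing the result. If the intent was to keep only the rows whose name occurs in tb2, the subquery would have to be written without the aggregate, which is a different query. The optimized SQL is:
```sql
select t1.*
from user01.tb1 t1;
```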
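That an aggregate subquery without `GROUP BY` yields a row even when nothing matches, which is what makes the original `EXISTS` true for every row of tb1, can be checked directly. In the sketch below, `'no_such_name'` is a made-up value assumed not to occur in `tb2.name`:

```sql
-- Aggregate without GROUP BY: one row comes back even when no row matches.
select max(id)
from user01.tb2
where name = 'no_such_name';
-- The result is a single row holding NULL, so EXISTS (...) over it is true.
```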
2. Use the stream hint `no broadcast(t1)`, placed directly after the `select` keyword, to keep t1 from being broadcast. The optimized SQL is:
```sql
select /*+ no broadcast(t1) */ t1.id, t2.id2
from user01.tb1 t1
inner join user01.t_skew t2 on t1.id = t2.id2 and t1.name = 'beijing'
order by 1;
```
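An alternative sketch, assuming GaussDB-style stream hints: `redistribute(t1)` asks for t1 to be redistributed on the join key instead of only forbidding the broadcast. Prefixing the statement with `explain` shows which stream operator ends up above the scan of t1.

```sql
-- Request redistribution of t1 and inspect the resulting plan.
explain
select /*+ redistribute(t1) */ t1.id, t2.id2
from user01.tb1 t1
inner join user01.t_skew t2 on t1.id = t2.id2 and t1.name = 'beijing'
order by 1;
```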
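3. Run `EXPLAIN` (or `EXPLAIN VERBOSE`) on the statement and read the plan. For the condition `t2.name = 'beijing'`: if it appears as a filter on the scan of tb2, below the join, it has been pushed down; if it appears only as a join filter on the join node, it has not. Because the condition sits in the `ON` clause of the `LEFT JOIN` and refers only to the right table, it can be pushed down without changing the result. For the statement as a whole in a distributed cluster, a plan that runs the join on the DNs underneath `Streaming (type: GATHER)` indicates that the statement was pushed down, whereas a plan in which the CN only fetches table data from the DNs and performs the join and aggregation itself indicates that it was not.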
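As a sketch of that check, the plan can be requested as follows. The statement is taken verbatim from the question, including the column name `t2.td`. Exact node labels depend on the database version, so the comments describe what to look for rather than literal output.

```sql
explain verbose
select count(t1.*)
from user01.tb1 t1
left join user01.tb2 t2
  on t1.id = t2.td and t2.name = 'beijing';
-- Look for the condition t2.name = 'beijing' in the plan:
--   as a filter on the scan of tb2 -> predicate pushed below the join
--   only as a join filter          -> predicate evaluated at the join
```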
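4. Use the hint `leading((t1 t2))`. The doubled parentheses fix both the join order and the roles of the two tables: t1 becomes the outer table and t2 the inner (hashed) table. The hint must follow the `select` keyword directly; it has no effect inside the `from` clause. The optimized SQL is:
```sql
select /*+ leading((t1 t2)) */ t1.id, t2.id2
from user01.tb1 t1
inner join user01.t_skew t2 on t1.id = t2.id2 and t1.name = 'beijing'
order by 1;
```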
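If the optimizer reacts to the changed table order by switching to another join method, the join method can be pinned as well. The sketch below assumes GaussDB-style plan hints, where `hashjoin(t1 t2)` requests a hash join between the two tables; running it under `explain` shows whether t2 sits beneath the Hash node.

```sql
-- Fix the inner/outer roles and the join method, then inspect the plan.
explain
select /*+ leading((t1 t2)) hashjoin(t1 t2) */ t1.id, t2.id2
from user01.tb1 t1
inner join user01.t_skew t2 on t1.id = t2.id2 and t1.name = 'beijing'
order by 1;
```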
5. The SQL that grants user1 the schema privilege (replace `schema_name` with the target schema):
```sql
GRANT USAGE ON SCHEMA schema_name TO user1;
```
The SQL that grants user1 access to all tables in schema s2:
```sql
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA s2 TO user1;
```
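`GRANT ... ON ALL TABLES IN SCHEMA` covers only the tables that exist when it runs, and user1 also needs `USAGE` on s2 to reach them. A sketch of the two follow-up statements, assuming read-only access is enough for tables created later:

```sql
-- Allow user1 to resolve objects inside s2.
GRANT USAGE ON SCHEMA s2 TO user1;
-- Tables that the current user creates in s2 from now on are readable by user1.
ALTER DEFAULT PRIVILEGES IN SCHEMA s2 GRANT SELECT ON TABLES TO user1;
```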
6. In gsql, timing of each statement is switched on with the `\timing` meta-command (the `CREATE DATABASE` statement for usdb is given under item 8):
```sql
\timing on
```
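7. The SQL that creates user us01 and gives it the sysadmin privilege is below. In GaussDB, `SYSADMIN` is a user attribute rather than a role, so it is set with `ALTER USER` instead of `GRANT sysadmin`. The password is a placeholder:
```sql
CREATE USER us01 PASSWORD 'Placeholder@123';  -- placeholder password, replace it
ALTER USER us01 SYSADMIN;
```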
8. The SQL that creates database usdb:
```sql
CREATE DATABASE usdb WITH ENCODING 'UTF8' CONNECTION LIMIT 20;
```
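The settings can be read back from the system catalog afterwards. A small sketch using `pg_database` and `pg_encoding_to_char`, both of which come from PostgreSQL and are assumed to be available in the target database:

```sql
-- Show the encoding and connection limit recorded for usdb.
select datname,
       pg_encoding_to_char(encoding) as encoding,
       datconnlimit
from pg_database
where datname = 'usdb';
```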
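9. An equivalent rewrite as an anti-join (`LEFT JOIN ... IS NULL`), which the optimizer can execute as a hash join. Like the reference rewrite in the question, it matches the `NOT IN` semantics only if `tb2.id` contains no NULLs among the 'shanghai' rows: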
```sql
select t1.*
from user01.tb1 t1
left join (
select id
from user01.tb2
where name = 'shanghai'
) t2 on t1.id = t2.id
where t2.id is null and t1.id is not null;
```
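Replacing `NOT IN` with an anti-join is only safe when the subquery column has no NULLs, and that can be checked before swapping the queries. A small sketch: if the count below is greater than zero, the original `NOT IN` query returns no rows at all, while the rewritten query may still return rows, so the two are no longer equivalent.

```sql
-- Count NULL ids among the rows the subquery would return.
select count(*) as null_ids
from user01.tb2
where name = 'shanghai' and id is null;
```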
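10. The plan estimates only 100 rows for t1, whereas the question gives a `reltuples` value of 123456798765 (about 1.2 × 10^11), so the optimizer is planning with a badly wrong row estimate. Based on that estimate it broadcasts t1 to all 10 DNs and uses it as the hash inner table. With the real row count, the BROADCAST stream (id 5) and the hash build on top of it (id 4) are the likely bottleneck, because every DN would receive and hash the whole table. Suggested fixes: run `ANALYZE` on t1 (and t2) so that the estimates are refreshed and the plan is regenerated; if t1 is still broadcast, use hints to forbid the broadcast and make the smaller table t2 the hash inner table, or distribute both tables on the join column so that no stream is needed. Raising the memory limit to reduce disk I/O only treats the symptom.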
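A minimal sketch of refreshing the statistics, assuming the tables are named t1 and t2 as in the plan (the query behind the plan is not shown in the question, so it is not reproduced here):

```sql
-- Refresh optimizer statistics for both tables in the join.
ANALYZE t1;
ANALYZE t2;
-- Read back the stored row counts to confirm they reflect the table sizes.
select relname, reltuples
from pg_class
where relname in ('t1', 't2');
```

Rerunning `explain` on the query afterwards shows whether t1 is still broadcast.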