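You are a data quality engineer. Identify 5 ways to score data completeness and write the SQL for each
Time: 2023-07-15 12:10:55
Here are 5 ways to score data completeness, each with a SQL example: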
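1. Completeness percentage score
This method computes, for each field in a table, the share of non-NULL values among all rows, expressed as a percentage.
SQL example:
```
-- COUNT(column) skips NULLs. 100.0 forces decimal division (integer division returns 0 on many engines),
-- and NULLIF avoids a divide-by-zero error on an empty table.
SELECT
  100.0 * COUNT(column1) / NULLIF(COUNT(*), 0) AS column1_completeness_score,
  100.0 * COUNT(column2) / NULLIF(COUNT(*), 0) AS column2_completeness_score,
  100.0 * COUNT(column3) / NULLIF(COUNT(*), 0) AS column3_completeness_score
FROM table_name;
```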
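In practice, blank strings are often treated as missing as well. The following is a minimal sketch of that variant, assuming a hypothetical `customers_demo` table and the rule "whitespace-only counts as missing"; neither the table nor the rule comes from the answer above.

```sql
-- Hypothetical sample table; names and rows are for illustration only.
CREATE TABLE customers_demo (
    id    INTEGER PRIMARY KEY,
    email VARCHAR(100)
);

INSERT INTO customers_demo (id, email) VALUES
    (1, 'a@example.com'),
    (2, NULL),
    (3, '   '),
    (4, 'd@example.com');

SELECT
    -- Counts only NULL as missing: 3 of 4 rows are populated.
    100.0 * COUNT(email) / NULLIF(COUNT(*), 0) AS email_not_null_score,
    -- Assumption: whitespace-only strings are also missing.
    -- TRIM reduces them to '', and NULLIF turns '' into NULL so COUNT skips it.
    100.0 * COUNT(NULLIF(TRIM(email), '')) / NULLIF(COUNT(*), 0) AS email_not_blank_score
FROM customers_demo;
```

On this sample the first score is 75 and the second is 50, because row 3 is populated but blank.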
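2. Missing value count score
This method counts the missing (NULL) values in each field and turns the count into a score: the more values are missing, the lower the score. The example deducts one point per missing value from a base of 100, so the result depends on table size and drops below 0 once a field has more than 100 missing values.
SQL example:
```
-- COUNT(*) - COUNT(column) is the number of NULLs in that column.
SELECT
  100 - (COUNT(*) - COUNT(column1)) AS column1_completeness_score,
  100 - (COUNT(*) - COUNT(column2)) AS column2_completeness_score,
  100 - (COUNT(*) - COUNT(column3)) AS column3_completeness_score
FROM table_name;
```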
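A count-based score is easier to read when the raw missing count is reported next to it and the score is floored at 0. The sketch below shows one way to do that on a hypothetical `orders_demo` table; the one-point-per-missing-value penalty and the floor at 0 are assumptions, not part of the original answer.

```sql
-- Hypothetical sample table; names and rows are for illustration only.
CREATE TABLE orders_demo (
    order_id   INTEGER PRIMARY KEY,
    shipped_on VARCHAR(10)
);

INSERT INTO orders_demo (order_id, shipped_on) VALUES
    (1, '2023-07-01'),
    (2, NULL),
    (3, '2023-07-03'),
    (4, NULL),
    (5, '2023-07-05');

SELECT
    -- Raw number of NULLs in the column.
    COUNT(*) - COUNT(shipped_on) AS shipped_on_missing,
    -- One point off per missing value, floored at 0 (assumed scoring rule).
    CASE
        WHEN COUNT(*) - COUNT(shipped_on) >= 100 THEN 0
        ELSE 100 - (COUNT(*) - COUNT(shipped_on))
    END AS shipped_on_score
FROM orders_demo;
```

With 2 NULLs out of 5 rows, this returns a missing count of 2 and a score of 98.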
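3. Data type consistency score
This method checks whether each field's declared data type matches the type defined for it: 100 on a match, 0 otherwise. Standard SQL has no data_type() function, and a column's type is the same for every row, so the check reads the catalog view information_schema.columns rather than the table itself. Add a table_schema filter if the same table name exists in more than one schema.
SQL example:
```
-- Type names follow MySQL's information_schema; other engines report different names
-- (PostgreSQL, for example, uses 'character varying' and 'integer').
SELECT
  MAX(CASE WHEN column_name = 'column1' AND data_type = 'varchar' THEN 100 ELSE 0 END) AS column1_completeness_score,
  MAX(CASE WHEN column_name = 'column2' AND data_type = 'int' THEN 100 ELSE 0 END) AS column2_completeness_score,
  MAX(CASE WHEN column_name = 'column3' AND data_type = 'date' THEN 100 ELSE 0 END) AS column3_completeness_score
FROM information_schema.columns
WHERE table_name = 'table_name';
```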
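When many columns need checking, the expected types can live in a table of their own and be joined to the catalog. The sketch below assumes PostgreSQL (its type names and its `current_schema()` function); `type_demo` and `expected_column_types` are hypothetical tables introduced only for this illustration.

```sql
-- Hypothetical table under test: column3 is declared TEXT although a DATE is expected.
CREATE TABLE type_demo (
    column1 VARCHAR(50),
    column2 INTEGER,
    column3 TEXT
);

-- Hypothetical table holding the documented schema, using PostgreSQL type names.
CREATE TABLE expected_column_types (
    table_name    VARCHAR(64),
    column_name   VARCHAR(64),
    expected_type VARCHAR(64)
);

INSERT INTO expected_column_types (table_name, column_name, expected_type) VALUES
    ('type_demo', 'column1', 'character varying'),
    ('type_demo', 'column2', 'integer'),
    ('type_demo', 'column3', 'date');

SELECT
    e.column_name,
    e.expected_type,
    c.data_type AS actual_type,
    -- A column missing from the catalog yields a NULL actual_type and scores 0.
    CASE WHEN c.data_type = e.expected_type THEN 100 ELSE 0 END AS type_score
FROM expected_column_types AS e
LEFT JOIN information_schema.columns AS c
       ON c.table_schema = current_schema()
      AND c.table_name   = e.table_name
      AND c.column_name  = e.column_name
ORDER BY e.column_name;
```

Under these assumptions, column1 and column2 score 100 and column3 scores 0, since its actual type is `text`.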
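4. Value range coverage score
This method scores how much of a field's defined value range actually occurs in the data: the more of the allowed values are present, the higher the score. The example divides the number of distinct values in the field by the size of the defined range, which it reads from hypothetical reference tables column1_domain, column2_domain and column3_domain (one row per allowed value). It assumes the field holds no values outside that range; otherwise the score can exceed 100.
SQL example:
```
-- columnN_domain: hypothetical reference table with one row per allowed value of columnN.
SELECT
  100.0 * COUNT(DISTINCT column1) / NULLIF((SELECT COUNT(*) FROM column1_domain), 0) AS column1_completeness_score,
  100.0 * COUNT(DISTINCT column2) / NULLIF((SELECT COUNT(*) FROM column2_domain), 0) AS column2_completeness_score,
  100.0 * COUNT(DISTINCT column3) / NULLIF((SELECT COUNT(*) FROM column3_domain), 0) AS column3_completeness_score
FROM table_name;
```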
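If the field may contain values outside its defined range, a join against the list of allowed values keeps them out of the score. The following sketch assumes a hypothetical `status_domain` table listing the allowed values and a hypothetical `orders_status_demo` table holding the data.

```sql
-- Hypothetical list of allowed values.
CREATE TABLE status_domain (
    status VARCHAR(20) PRIMARY KEY
);

INSERT INTO status_domain (status) VALUES
    ('new'), ('paid'), ('shipped'), ('cancelled');

-- Hypothetical data table: 'refunded' is outside the allowed list.
CREATE TABLE orders_status_demo (
    order_id INTEGER PRIMARY KEY,
    status   VARCHAR(20)
);

INSERT INTO orders_status_demo (order_id, status) VALUES
    (1, 'new'),
    (2, 'paid'),
    (3, 'paid'),
    (4, 'refunded'),
    (5, NULL);

SELECT
    -- Starting from the domain means only allowed values can be counted as covered.
    100.0 * COUNT(DISTINCT o.status) / NULLIF(COUNT(DISTINCT d.status), 0) AS status_coverage_score
FROM status_domain AS d
LEFT JOIN orders_status_demo AS o
       ON o.status = d.status;
```

Two of the four allowed values ('new' and 'paid') appear in the data, so the score is 50. The out-of-range 'refunded' and the NULL do not affect it.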
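5. Relational integrity score
This method is meant to score the relationships between fields. The example covers only a prerequisite for such relationships: whether each field could serve as a key, meaning every row holds a distinct, non-NULL value. A field that satisfies this scores 100; any duplicate or NULL gives 0.
SQL example:
```
-- COUNT(DISTINCT column) ignores NULLs, so it equals COUNT(*) only when
-- the column has no NULLs and no duplicates.
SELECT
  CASE
    WHEN COUNT(DISTINCT column1) = COUNT(*) THEN 100
    ELSE 0
  END AS column1_completeness_score,
  CASE
    WHEN COUNT(DISTINCT column2) = COUNT(*) THEN 100
    ELSE 0
  END AS column2_completeness_score,
  CASE
    WHEN COUNT(DISTINCT column3) = COUNT(*) THEN 100
    ELSE 0
  END AS column3_completeness_score
FROM table_name;
```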
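A relationship between two tables can be scored as the share of rows whose reference resolves. The sketch below assumes hypothetical parent and child tables (`customers_ref_demo`, `orders_ref_demo`) and treats a NULL reference as a failure; both choices are illustrative rather than taken from the answer above.

```sql
-- Hypothetical parent table.
CREATE TABLE customers_ref_demo (
    customer_id INTEGER PRIMARY KEY
);

INSERT INTO customers_ref_demo (customer_id) VALUES (1), (2), (3);

-- Hypothetical child table: order 13 points at a missing customer, order 14 has no customer.
CREATE TABLE orders_ref_demo (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER
);

INSERT INTO orders_ref_demo (order_id, customer_id) VALUES
    (10, 1),
    (11, 2),
    (12, 2),
    (13, 9),
    (14, NULL);

SELECT
    -- c.customer_id is NULL for unmatched orders, so COUNT skips them.
    -- The parent key is unique, so the join never duplicates an order row.
    100.0 * COUNT(c.customer_id) / NULLIF(COUNT(*), 0) AS customer_ref_score
FROM orders_ref_demo AS o
LEFT JOIN customers_ref_demo AS c
       ON c.customer_id = o.customer_id;
```

Three of the five orders resolve to an existing customer, so the score is 60.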