国内某零售企业销售数据有如下6张数据表,数据文件中字段之间以'\t'分隔 (1)门店信息表store,对应数据文件store.txt (2)顾客基本信息表customer,对应数据文件customer.txt (4)销售订单明细表sale_goods,对应数据文件sale_goods.txt (5)商品类别表category,对应数据文件category.txt 门店信息表store,对应数据文件store.txt 字段名 数据类型 中文描述 store_id string 零售商内部门店ID store_name string 门店名称 请根据以上数据表结构完成以下操作。 数据准备:(50分) 1、创建数据库sale并使用该数据库(10分) 2、创建上面给出的6张数据表(20分) 3、向6张表中加载数据(10分) 4、进行数据清洗,去除sale_goods表中的异常数据,只保留2020-06-01到2021-05-14的数据(10分)
时间: 2024-03-22 08:40:58 浏览: 70
以下是针对问题的解答:
1、创建数据库sale并使用该数据库:
```mysql
CREATE DATABASE sale;
USE sale;
```
2、创建6张数据表:
```mysql
CREATE TABLE store (
store_id VARCHAR(20) PRIMARY KEY,
store_name VARCHAR(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE customer (
customer_id VARCHAR(20) PRIMARY KEY,
customer_name VARCHAR(50) NOT NULL,
gender VARCHAR(10) NOT NULL,
age INT NOT NULL,
phone_number VARCHAR(20)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE category (
category_id VARCHAR(20) PRIMARY KEY,
category_name VARCHAR(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE goods (
goods_id VARCHAR(20) PRIMARY KEY,
goods_name VARCHAR(50) NOT NULL,
category_id VARCHAR(20) NOT NULL,
price DECIMAL(10, 2) NOT NULL,
FOREIGN KEY (category_id) REFERENCES category (category_id) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE sale_order (
order_id VARCHAR(20) PRIMARY KEY,
store_id VARCHAR(20) NOT NULL,
customer_id VARCHAR(20) NOT NULL,
order_date DATE NOT NULL,
order_amount DECIMAL(10, 2) NOT NULL,
FOREIGN KEY (store_id) REFERENCES store (store_id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (customer_id) REFERENCES customer (customer_id) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE sale_goods (
sale_id VARCHAR(20) PRIMARY KEY,
order_id VARCHAR(20) NOT NULL,
goods_id VARCHAR(20) NOT NULL,
quantity INT NOT NULL,
amount DECIMAL(10, 2) NOT NULL,
FOREIGN KEY (order_id) REFERENCES sale_order (order_id) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (goods_id) REFERENCES goods (goods_id) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```
3、向6张表中加载数据:
```mysql
LOAD DATA INFILE '/path/to/store.txt' INTO TABLE store CHARACTER SET utf8mb4 FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 1 ROWS;
LOAD DATA INFILE '/path/to/customer.txt' INTO TABLE customer CHARACTER SET utf8mb4 FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 1 ROWS;
LOAD DATA INFILE '/path/to/category.txt' INTO TABLE category CHARACTER SET utf8mb4 FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 1 ROWS;
LOAD DATA INFILE '/path/to/goods.txt' INTO TABLE goods CHARACTER SET utf8mb4 FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 1 ROWS;
LOAD DATA INFILE '/path/to/sale_order.txt' INTO TABLE sale_order CHARACTER SET utf8mb4 FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 1 ROWS;
LOAD DATA INFILE '/path/to/sale_goods.txt' INTO TABLE sale_goods CHARACTER SET utf8mb4 FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' IGNORE 1 ROWS;
```
4、进行数据清洗,去除sale_goods表中的异常数据,只保留2020-06-01到2021-05-14的数据:
```mysql
DELETE FROM sale_goods WHERE order_id IN (
SELECT order_id FROM sale_order WHERE order_date < '2020-06-01' OR order_date > '2021-05-14'
);
```
阅读全文