Draw a box plot of the standardized data.
fitness <- data.frame(scale(fitness))  # standardize each column (mean 0, sd 1)
na <- colnames(fitness)                # column names, used as box labels
boxplot(fitness$Age, fitness$Weight, fitness$Oxygen, fitness$RunTime,
        fitness$RestPulse, fitness$RunPulse, fitness$MaxPulse,
        main = "Box plot", names = na)
The resulting box plot is shown in the figure below:
Figure 1: Box plot
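As a minimal sketch of what the standardization step does, the following checks on a small made-up data frame that `scale()` subtracts each column's mean and divides by its sample standard deviation. The data frame `toy` and the names `toy_scaled`, `toy_manual`, `col_means`, and `col_sds` are hypothetical and are not part of the fitness data.

```r
# Hypothetical toy data, used only for illustration (not the fitness data set)
toy <- data.frame(a = c(1, 2, 3, 4, 5), b = c(10, 20, 30, 40, 100))

# scale() centers each column and divides it by its sample standard deviation
toy_scaled <- data.frame(scale(toy))

# The same transformation written out by hand for every column
toy_manual <- data.frame(lapply(toy, function(x) (x - mean(x)) / sd(x)))

# Largest absolute difference between the two versions (should be about 0)
max_diff <- max(abs(as.matrix(toy_scaled) - as.matrix(toy_manual)))

# After standardization each column has mean 0 and standard deviation 1
col_means <- sapply(toy_scaled, mean)
col_sds <- sapply(toy_scaled, sd)
print(max_diff)
print(col_means)
print(col_sds)
```

Because every column ends up on the same scale, variables measured in different units can be compared side by side in a single box plot.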
3.1.3 Data cleaning
The analysis above shows that the data contain outliers; these observations are removed below.
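# Find the row positions of the outliers in each column
outlier_location <- sapply(fitness, function(X) {
  which(X %in% boxplot.stats(X)$out)
})
todel <- sort(unique(unlist(outlier_location)))      # union of outlier rows
if (length(todel) > 0) fitness <- fitness[-todel, ]  # remove the outlier rows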
boxplot(fitness$Age, fitness$Weight, fitness$Oxygen, fitness$RunTime,
        fitness$RestPulse, fitness$RunPulse, fitness$MaxPulse,
        main = "Box plot", names = na)
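The outlier rule used above can be sketched on a small made-up vector. With its default `coef = 1.5`, `boxplot.stats()` flags values lying more than 1.5 times the box length beyond the hinges. The vector `x` and the data frame `toy` below are hypothetical. The sketch also guards against the case where nothing is flagged, since indexing with an empty negative vector (`toy[-integer(0), ]`) would return zero rows rather than the full data.

```r
# Hypothetical toy vector with one obvious extreme value
x <- c(2, 3, 3, 4, 4, 5, 5, 6, 30)

# Values that boxplot.stats() flags as outliers (default coef = 1.5)
out_values <- boxplot.stats(x)$out

# Positions of the flagged values within x
out_pos <- which(x %in% out_values)

# Drop the flagged rows only if there is something to drop
toy <- data.frame(x = x, y = seq_along(x))
toy_clean <- if (length(out_pos) > 0) toy[-out_pos, ] else toy

print(out_values)
print(nrow(toy_clean))
```

Here the hinges are 3 and 5, so anything below 0 or above 8 is flagged, which leaves only the value 30 to be removed.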
The box plot drawn after removing the outliers is shown in the figure below: