tsne <- Rtsne(as.matrix(optdigits[,1:3]), check_duplicates = FALSE, pca = FALSE, perplexity=30, theta=0.5, dims=2)
Error in na.fail.default(X) : missing values in object

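This error means the data passed to `Rtsne()` contains missing values (`NA`). You need to handle them before running t-SNE. One option is to use `complete.cases()` to drop the rows that contain missing values, for example:

```
optdigits <- optdigits[complete.cases(optdigits),]
```

This removes every row that has a missing value in any column. Since only the first three columns are used here, `complete.cases(optdigits[,1:3])` would drop fewer rows. After that, run the t-SNE analysis again.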

tsne <- Rtsne(as.matrix(optdigits[,1:3]), check_duplicates = FALSE, pca = FALSE, perplexity=30, theta=0.5, dims=2)
Why is the resulting plot blank?

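There are a few possible reasons for a blank plot:

1. Data format: make sure the first three columns you selected are numeric. If they are character or some other non-numeric type, `Rtsne()` cannot process them as intended and the plot may come out empty.
2. Parameter settings: check the arguments you pass to `Rtsne()`. `perplexity` is a key t-SNE parameter that controls the effective neighborhood size around each point. A value that is far too large or too small for the dataset can give a failed or uninformative embedding.
3. Computation time: t-SNE is expensive on large datasets, so the call may take a while to finish. You may simply need to wait before any result appears.

Also check what you pass to the plotting call: `Rtsne()` returns a list, and the embedded coordinates are stored in `tsne$Y`, so that matrix is what should be plotted.

Start by checking the data format, then try different parameter settings, and allow the computation time to finish. If the problem persists, please share more details, including the plotting code, so I can help further.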

Optimize this code:

    def cluster_format(self, start_time, end_time, save_on=True, data_clean=False, data_name=None):
        """
        local format function is to format data from beihang.
        :param start_time:
        :param end_time:
        :return:
        """
        # Household cluster-level data cleaning
        if data_clean:
            unused_index_col = [i for i in self.df.columns if 'Unnamed' in i]
            self.df.drop(columns=unused_index_col, inplace=True)
            self.df.drop_duplicates(inplace=True, ignore_index=True)
            self.df.reset_index(drop=True, inplace=True)
            dupli_header_lines = np.where(self.df['sendtime'] == 'sendtime')[0]
            self.df.drop(index=dupli_header_lines, inplace=True)
            self.df = self.df.apply(pd.to_numeric, errors='ignore')
            self.df['sendtime'] = pd.to_datetime(self.df['sendtime'])
            self.df.sort_values(by='sendtime', inplace=True, ignore_index=True)
            self.df.to_csv(data_name, index=False)
        # Call the base formatting routine
        self.df = super().format(start_time, end_time)
        module_number_register = np.unique(self.df['bat_module_num'])
        # if registered m_num is 0 and not changed, there is no module data
        if not np.any(module_number_register):
            logger.logger.warning("No module data!")
            sys.exit()
        if 'bat_module_voltage_00' in self.df.columns:
            volt_ref = 'bat_module_voltage_00'
        elif 'bat_module_voltage_01' in self.df.columns:
            volt_ref = 'bat_module_voltage_01'
        elif 'bat_module_voltage_02' in self.df.columns:
            volt_ref = 'bat_module_voltage_02'
        else:
            logger.logger.warning("No module data!")
            sys.exit()
        self.df.dropna(axis=0, subset=[volt_ref], inplace=True)
        self.df.reset_index(drop=True, inplace=True)
        self.headers = list(self.df.columns)
        # time duration of a cluster
        self.length = len(self.df)
        if self.length == 0:
            logger.logger.warning("After cluster data clean, no effective data!")
            raise ValueError("No effective data after cluster data clean.")
        self.cluster_stats(save_on)
        for m in range(self.mod_num):
            print(self.clusterid, self.mod_num)
            self.module_list.append(np.unique(self.df[f'bat_module_sn_{str(m).zfill(2)}'].dropna())[0])

Here are some possible improvements to the given code:

1. The list comprehension that finds columns with 'Unnamed' in their names is already idiomatic. If you prefer a functional style, `filter()` with a lambda gives the same result, although it is not shorter:

```
unused_index_col = list(filter(lambda x: 'Unnamed' in x, self.df.columns))
```

2. `drop_duplicates()` with `ignore_index=True` already renumbers the index, so the separate `reset_index(drop=True, inplace=True)` call that follows it is redundant and can be removed:

```
self.df.drop_duplicates(inplace=True, ignore_index=True)
```

3. Instead of calling `sys.exit()` when there is no module data, raise a `ValueError` with an appropriate message, as the function already does for the empty-data case. The caller can then decide how to handle it:

```
raise ValueError("No module data!")
```

4. Instead of a chain of `if`/`elif` checks to find the voltage reference column, you can apply a boolean mask to `self.df.columns` and take the first column whose name starts with 'bat_module_voltage':

```
volt_ref = self.df.columns[self.df.columns.str.startswith('bat_module_voltage')][0]
```

Note that this takes the first match in column order rather than explicitly preferring `_00`, then `_01`, then `_02`, and the `[0]` raises an `IndexError` when no column matches, so the no-data case still needs explicit handling.

5. The final loop prints debug output on every iteration and appends one item at a time. Dropping the `print()` call and building the list in a single expression is tidier:

```
self.module_list.extend(np.unique(self.df[f'bat_module_sn_{m:02d}'].dropna())[0] for m in range(self.mod_num))
```

These changes mainly make the code shorter and easier to reason about. They are unlikely to change its running time noticeably.
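The suggestions above can be sketched as a small, self-contained example. This is only an illustration under stated assumptions: the helper names `pick_voltage_ref`, `clean_frame`, and `first_serials` and the sample data are hypothetical and not part of the original class, and the column lookup assumes zero-padded two-digit suffixes so that sorted order matches module order.

```python
import numpy as np
import pandas as pd


def pick_voltage_ref(df: pd.DataFrame, prefix: str = "bat_module_voltage_") -> str:
    """Return the lowest-numbered voltage column, or raise if none exists.

    Assumption: suffixes are zero-padded ("00", "01", ...), so a plain
    lexicographic sort puts the lowest module first.
    """
    candidates = sorted(c for c in df.columns if c.startswith(prefix))
    if not candidates:
        # Raising lets the caller decide what to do, unlike sys.exit().
        raise ValueError("No module data!")
    return candidates[0]


def clean_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Drop stray 'Unnamed' index columns and duplicate rows (input is not mutated)."""
    unnamed = [c for c in df.columns if "Unnamed" in c]
    # ignore_index=True renumbers the rows, so no reset_index() is needed.
    return df.drop(columns=unnamed).drop_duplicates(ignore_index=True)


def first_serials(df: pd.DataFrame, mod_num: int) -> list:
    """Collect, per module, the first sorted non-null serial number."""
    return [
        np.unique(df[f"bat_module_sn_{m:02d}"].dropna())[0]
        for m in range(mod_num)
    ]


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],
    "bat_module_voltage_01": [3.3, 3.3, 3.4],
    "bat_module_sn_00": ["A1", "A1", None],
    "bat_module_sn_01": ["B7", "B7", "B7"],
})

cleaned = clean_frame(sample)
print(pick_voltage_ref(cleaned))
print(first_serials(cleaned, mod_num=2))
```

Keeping these steps as small functions that return values, rather than mutating `self.df` in place, makes each one easy to test on its own. Whether that fits the original class design is a judgment call.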
