Explain this line of code: sapply(x=strsplit(colnames(pbmc),split = '_'),Fun='[',1)
Time: 2024-01-18 16:04:28 Views: 29
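This line uses R's `sapply` function to process the column names of the `pbmc` dataset. `strsplit` first splits every column name at the underscore and returns a list with one character vector per column name. `sapply` then applies the `[` operator with index 1 to each of those vectors, which extracts the first piece, that is, the part of the name before the first underscore, and simplifies the results into a single vector. In short, the code returns a vector containing the part of each column name that precedes the first underscore. Note that R argument names are case-sensitive, so for the call to run the arguments have to be spelled `X` and `FUN` (or passed by position): `sapply(X = strsplit(colnames(pbmc), split = '_'), FUN = '[', 1)`.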
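For readers more familiar with Python, the same "split, then keep the first piece" idea can be sketched as below. The helper name `first_field` and the sample column names are hypothetical and only serve as an illustration; they do not come from the `pbmc` object.

```python
def first_field(names, sep="_"):
    """Return the part of each name that precedes the first separator."""
    # str.split returns the whole string as the only piece when sep is absent,
    # so names without an underscore are returned unchanged.
    return [name.split(sep)[0] for name in names]


# Hypothetical column names of the form "<prefix>_<suffix>"
colnames = ["ctrl_AAACCTG", "ctrl_AAACGGG", "stim_AAAGATG", "nounderscore"]
prefixes = first_field(colnames)
print(prefixes)
```

Like the R call, this keeps only the text before the first separator and returns one value per input name.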
Related questions
id = toupper(sapply(strsplit(srcFile, '[_.]'), '[[', 1)) translate into Python
A Python equivalent, assuming `srcFile` is a single string (for a list of strings, apply the same expression to each element), is:
import re; id = re.split(r'[_.]', srcFile)[0].upper()
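Because R's `strsplit` is vectorised over its input, a Python version often has to accept either a single string or a list of strings. The following is a minimal sketch under that assumption; the helper name `first_token_upper` and the sample file names are hypothetical.

```python
import re


def first_token_upper(src):
    """Mimic toupper(sapply(strsplit(src, '[_.]'), '[[', 1)) for a str or a list of str."""
    if isinstance(src, str):
        # Split on "_" or "." and keep the first piece, then upper-case it.
        return re.split(r"[_.]", src)[0].upper()
    # For a list, apply the same rule element by element.
    return [re.split(r"[_.]", s)[0].upper() for s in src]


single = first_token_upper("sample01_run2.fastq")
many = first_token_upper(["ab.c_d", "x_y"])
print(single, many)
```

The character class `[_.]` is the same in both languages, so the pattern from the R call can be reused unchanged.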
brier_efron <- function(y_train_true, y_train_pred, y_newdata, y_newdata_pred, times){ baseline <- base_efron(y_train_true, y_train_pred) y_newdata <- data.frame(y_newdata) colnames(y_newdata) = c("time","event") new_index <- order(y_newdata$time) y_newdata <- y_newdata[new_index,] y_newdata_pred <- y_newdata_pred[new_index,] Y_x = sapply(times, function(x){as.integer(y_newdata$time > x)}) ipcw <- pec::ipcw(formula = as.formula(Surv(time, event) ~ 1), data = y_newdata, method = "marginal", times = times, subjectTimes = y_newdata$time, subjectTimesLag = 1) G_t = ipcw$IPCW.times G_x = ipcw$IPCW.subjectTimes W_x = matrix(NA, nrow = nrow(y_newdata), ncol = length(times)) for (t in 1:length(times)) { W_x[,t] = (1-Y_x[,t])*y_newdata$event/G_x + Y_x[,t]/G_t[t] } Lambda_t = sapply(times, function(x){baseline$cumhazard$hazard[sum(baseline$cumhazard$time <= x)] }) S_x = exp(-1 * exp(y_newdata_pred) %*% matrix(Lambda_t, nrow = 1)) BS_t = sapply(1:length(times), function(x) {mean(W_x[,x] * (Y_x[,x] - S_x[,x])^2)}) return(list(bs = data.frame(time=times, bs=BS_t))) } convert to Python code
Here is a Python version of the R code:
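```python
import numpy as np
import pandas as pd
from lifelines import KaplanMeierFitter


def brier_efron(y_train_true, y_train_pred, y_newdata, y_newdata_pred, times):
    # base_efron is the asker's own helper (not shown in the question). It is ASSUMED
    # here to return a dict whose "cumhazard" entry has "time" and "hazard" columns.
    baseline = base_efron(y_train_true, y_train_pred)
    times = np.asarray(times, dtype=float)

    # Sort the new data by time and reorder the predictions the same way.
    y_newdata = pd.DataFrame(np.asarray(y_newdata), columns=["time", "event"])
    order = np.argsort(y_newdata["time"].to_numpy(), kind="stable")
    time = y_newdata["time"].to_numpy(dtype=float)[order]
    event = y_newdata["event"].to_numpy(dtype=float)[order]
    pred = np.asarray(y_newdata_pred, dtype=float).reshape(-1)[order]

    # Y_x[i, t] = 1 if subject i is still event-free after times[t].
    Y_x = (time[:, None] > times[None, :]).astype(float)

    # Censoring distribution G: Kaplan-Meier with the event indicator flipped.
    # Intended to approximate pec::ipcw(method = "marginal"); evaluating G just
    # before each subject time is meant to mimic subjectTimesLag = 1.
    kmf = KaplanMeierFitter().fit(time, event_observed=1 - event)
    G_t = kmf.survival_function_at_times(times).to_numpy()
    G_x = kmf.survival_function_at_times(np.nextafter(time, -np.inf)).to_numpy()

    # Inverse-probability-of-censoring weights, shape (n, len(times)).
    W_x = (1 - Y_x) * (event / G_x)[:, None] + Y_x / G_t[None, :]

    # Cumulative baseline hazard at the last training time <= t
    # (taken as 0 before the first training time, which is an assumption).
    cum = baseline["cumhazard"]
    idx = np.searchsorted(np.asarray(cum["time"]), times, side="right") - 1
    Lambda_t = np.where(idx >= 0, np.asarray(cum["hazard"])[np.clip(idx, 0, None)], 0.0)

    # Predicted survival exp(-exp(lp) * Lambda0(t)), shape (n, len(times)).
    S_x = np.exp(-(np.exp(pred)[:, None] @ Lambda_t[None, :]))
    BS_t = np.mean(W_x * (Y_x - S_x) ** 2, axis=0)
    return {"bs": pd.DataFrame({"time": times, "bs": BS_t})}
```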
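Note that this conversion relies on the lifelines library, which provides survival analysis tools for Python, so make sure lifelines is installed before running the code. The helper `base_efron` is not defined in the question and has to be supplied separately. Also keep in mind a few small differences from R, such as building the data frame with `pd.DataFrame` and using `@` for matrix multiplication.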
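The R expression `hazard[sum(time <= x)]` reads a step function: it returns the cumulative hazard at the last training time that is not later than `x`. A minimal NumPy sketch of that lookup is shown below. The helper name `step_lookup`, the sample table, and the choice to return 0 before the first time point are assumptions made for illustration, not part of the original code.

```python
import numpy as np


def step_lookup(grid_times, grid_values, query_times, before_first=0.0):
    """Evaluate a right-continuous step function at each query time."""
    grid_times = np.asarray(grid_times, dtype=float)    # must be sorted ascending
    grid_values = np.asarray(grid_values, dtype=float)
    query_times = np.asarray(query_times, dtype=float)
    # (number of grid times <= t) - 1 is the index of the last jump at or before t.
    idx = np.searchsorted(grid_times, query_times, side="right") - 1
    # idx == -1 means t precedes every grid time; use the fallback value there.
    return np.where(idx >= 0, grid_values[np.clip(idx, 0, None)], before_first)


# Hypothetical cumulative baseline hazard table
cum_time = [1.0, 2.5, 4.0]
cum_hazard = [0.1, 0.3, 0.6]
lambda_t = step_lookup(cum_time, cum_hazard, [0.5, 1.0, 3.0, 10.0])
print(lambda_t)
```

Handling the "before the first time point" case explicitly matters in Python, because an index of -1 would otherwise silently pick the last element of the array.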