1. Simulate a dataset containing continuous traits with given h^2. Split it into training and testing datasets with proportion 4:1.

    set.seed(20132014)
    n <- 5000
    p <- 1000
    h <- c(0.2, 0.8)[1]
    # simulate genotype (not exactly)
    x_r <- matrix(rnorm(n * p), ncol = p)
    xmean <- matrix(rep(colMeans(x_r), n), ncol = p, byrow = TRUE)
    xsd <- matrix(rep(apply(x_r, 2, sd), n), ncol = p, byrow = TRUE)
    x <- (x_r - xmean) / xsd
    # ...

4. Calculate polygenic scores with both the marginal and joint models on testing data, and compare their performance under low and high h^2.
Time: 2024-03-26 22:35:07 Views: 11
Sure, here's the code to calculate polygenic scores with both the marginal and joint models on the testing data:
```R
library(glmnet)

# Assumes x_train, y_train, x_test and y_test come from the 4:1 split in step 1,
# simulated with a single heritability value h.

# Joint model: Lasso on all SNPs at once, with lambda chosen by cross-validation
fit_joint <- cv.glmnet(x_train, y_train, alpha = 1)
# calculate polygenic scores on testing set using joint model
ps_joint <- as.vector(predict(fit_joint, newx = x_test, s = "lambda.min"))

# Marginal model: one simple linear regression per SNP on the training set
beta_marginal <- apply(x_train, 2, function(xj) coef(lm(y_train ~ xj))[2])
# calculate polygenic scores on testing set using marginal model
ps_marginal <- as.vector(x_test %*% beta_marginal)

# correlation between observed and predicted phenotype for the current h
cor_joint <- cor(y_test, ps_joint)
cor_marginal <- cor(y_test, ps_marginal)

# h is a single number, so low and high h^2 need separate runs:
# rerun the simulation with h <- c(0.2, 0.8)[2] and recompute both correlations.
```
To compare the two models, we calculate the correlation between the observed phenotype and the polygenic score on the testing set. The joint score comes from the Lasso fit, which estimates all SNP effects at once with the penalty chosen by cross-validation. The marginal score comes from simple linear regressions fitted one SNP at a time.
Because `h` holds a single value per simulation, run the whole script once with `h <- c(0.2, 0.8)[1]` and once with `h <- c(0.2, 0.8)[2]`, then compare `cor_joint` and `cor_marginal` from the two runs. The higher the correlation, the better the model's performance at predicting the phenotype.
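The comparison across low and high h^2 can also be sketched end to end. The phenotype simulation is elided in the question, so everything about it here is an assumption: the helper name `run_once`, the smaller `n` and `p` (chosen only so the sketch runs quickly), the choice of 20 causal SNPs, and the way the genetic and residual parts are rescaled to reach the target h^2. The marginal effects use the closed-form slope of a simple regression instead of a loop over `lm`.

```R
library(glmnet)

# Hypothetical helper: simulate one dataset for a given h^2, split it 4:1,
# fit both models on the training part, return test-set correlations.
run_once <- function(h2, n = 2000, p = 200, n_causal = 20) {
  # standardized pseudo-genotypes, as in the question
  x <- scale(matrix(rnorm(n * p), ncol = p))

  # ASSUMPTION: a sparse set of causal SNPs with normal effects
  beta <- numeric(p)
  beta[sample(p, n_causal)] <- rnorm(n_causal)
  g <- as.vector(x %*% beta)
  g <- g / sd(g) * sqrt(h2)        # genetic variance set to h2
  e <- rnorm(n)
  e <- e / sd(e) * sqrt(1 - h2)    # residual variance set to 1 - h2
  y <- g + e                       # total variance is approximately 1

  # 4:1 split into training and testing sets
  train <- sample(n, size = round(0.8 * n))
  x_train <- x[train, ]
  y_train <- y[train]
  x_test <- x[-train, ]
  y_test <- y[-train]

  # joint model: Lasso with cross-validated lambda
  fit_joint <- cv.glmnet(x_train, y_train, alpha = 1)
  ps_joint <- as.vector(predict(fit_joint, newx = x_test, s = "lambda.min"))

  # marginal model: per-SNP simple regression slope, cov(x_j, y) / var(x_j)
  xc <- scale(x_train, center = TRUE, scale = FALSE)
  beta_marginal <- as.vector(crossprod(xc, y_train - mean(y_train))) / colSums(xc^2)
  ps_marginal <- as.vector(x_test %*% beta_marginal)

  c(joint = cor(y_test, ps_joint), marginal = cor(y_test, ps_marginal))
}

set.seed(20132014)
# rows: model, columns: heritability setting
results <- sapply(c(low = 0.2, high = 0.8), run_once)
print(round(results, 3))
```

Each column of `results` is one heritability setting, so reading across a row shows how a model's accuracy changes with h^2, and reading down a column compares the joint and marginal scores at the same h^2.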
I hope this helps! Let me know if you have any further questions.