We can now use a method to plot the loss surface of the network by projecting the parameter updates into two dimensions. You can find more information on that here. But you can just use the provided code. The contour plot will show how the loss will change if you would follow the two main directions of the past parameter updates. Think about the challenges and the optimization process of this landscape. What could impede the convergence of the net? # project states onto the main directions of the gradient updates using n samples over all steps starting from sample x # the directions are calculated using the last sample as a reference directions, state_ids, loss_coordinates = get_state_directions(states, n_states=10, start_from=0, reference_id=-1) # compute the losses over the main directions of the gradient updates x, y, Z, _ = get_loss_grid(net, data_loader, loss_fn, directions=directions, resolution=(20, 20), scale=loss_coordinates.abs().max().item()) # plot the landscape as a contour plot fig = plot_contour(np.copy(x), np.copy(y), np.copy(Z), scale=True) fig.add_traces(go.Scatter(x=np.copy(loss_coordinates[0].cpu().numpy()), y=np.copy(loss_coordinates[1].cpu().numpy()))) print('loss samples:', np.array(losses)[state_ids]) conf_pltly() init_notebook_mode(connected=False) iplot(fig) --------------------------------------------------------------------------- RuntimeError Traceback (most recent call last) <ipython-input-62-26d05ea2d790> in <cell line: 3>() 1 # project states onto the main directions of the gradient updates using n samples over all steps starting from sample x 2 # the directions are calculated using the last sample as a reference ----> 3 directions, state_ids, loss_coordinates = get_state_directions(states, n_states=10, start_from=0, reference_id=-1) 4 5 # compute the losses over the main directions of the gradient updates <ipython-input-60-6cc4aad7dcda> in get_state_directions(states, n_states, start_from, reference_id) 15 params.append(param.view(-1)) 16 ---> 17 params = torch.stack(params, dim=0) 18 reference = params[-1] 19 RuntimeError: stack expects each tensor to be equal size, but got [200704] at entry 0 and [256] at entry 1这个错误怎么改
时间: 2024-03-30 09:34:13 浏览: 168
这个错误是由于在进行 torch.stack(params, dim=0) 操作时,params 中的张量维度不一致导致的。可以通过使用 torch.cat() 来代替 torch.stack(),将 params 列表中的张量按照指定维度拼接起来,从而避免这个错误。具体来说,可以将 get_state_directions() 函数中的 params = torch.stack(params, dim=0) 改为 params = torch.cat(params, dim=0)。
阅读全文