`def affine_forward(x, w, b): out = None; N = x.shape[0]; x_row = x.reshape(N, -1); out = np.dot(x_row, w) + b.reshape(1, -1); cache = (x, w, b); return out, cache` fails with `shapes (40,6272) and (784,100) not aligned: 6272 (dim 1) != 784 (dim 0)`
Date: 2023-09-14 14:08:15
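The error is raised by `np.dot(x_row, w)` because the two operands have incompatible shapes. `x_row` has shape (40, 6272) and `w` has shape (784, 100). Matrix multiplication requires the number of columns of the first matrix (dim 1, here 6272) to equal the number of rows of the second (dim 0, here 784). Transposing `w` does not help, because (100, 784) still has no dimension of size 6272. Either `w` must have 6272 rows, i.e. shape (6272, 100), or `x` must be an input that flattens to 784 values per sample. Since 6272 = 8 × 28 × 28 = 32 × 14 × 14, `x` is probably the output of an earlier layer rather than a raw 28 × 28 image, so the likely fix is to create `w` with the correct input dimension where the parameters are initialized. The function itself can stay as it is, with proper indentation and an explicit shape check:
```
import numpy as np

def affine_forward(x, w, b):
    N = x.shape[0]
    # Flatten each sample into a row vector: (N, d_1, ..., d_k) -> (N, D)
    x_row = x.reshape(N, -1)
    # w must have shape (D, M); fail early with both shapes in the message
    assert x_row.shape[1] == w.shape[0], (x_row.shape, w.shape)
    out = np.dot(x_row, w) + b.reshape(1, -1)
    cache = (x, w, b)
    return out, cache
```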
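One way to avoid this kind of mismatch is to derive the input dimension of `w` from the trailing dimensions of `x` instead of hard-coding 784. The following is a minimal sketch under stated assumptions: the helper name `init_affine_params`, the scale 0.01, the seed, and the input shape (40, 8, 28, 28) are all chosen for illustration (that shape is simply one that flattens to 6272) and are not taken from the original code.

```python
import numpy as np

def affine_forward(x, w, b):
    N = x.shape[0]
    x_row = x.reshape(N, -1)                 # (N, D)
    out = np.dot(x_row, w) + b.reshape(1, -1)
    return out, (x, w, b)

def init_affine_params(x, num_outputs, scale=0.01, seed=0):
    """Hypothetical helper: size w from the flattened input dimension of x."""
    rng = np.random.default_rng(seed)
    D = int(np.prod(x.shape[1:]))            # features per sample after flattening
    w = scale * rng.standard_normal((D, num_outputs))
    b = np.zeros(num_outputs)
    return w, b

x = np.zeros((40, 8, 28, 28))                # assumed input; flattens to 6272 per sample
w, b = init_affine_params(x, 100)
out, cache = affine_forward(x, w, b)
print(w.shape, out.shape)
```

With the weights sized this way, the inner dimensions agree by construction, whatever layer produced `x`.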
Related questions
`def affine_forward(x, w, b): out = None; N = x.shape[0]; x_row = x.reshape(N, -1); out = np.dot(x_row, w) + b.reshape(1, -1); cache = (x, w, b); return out, cache` fails with `shapes (40,6272) and (784,100) not aligned: 6272 (dim 1) != 784 (dim 0)`. After changing it to `def affine_forward(x, w, b): out = None; N = x.shape[0]; x_row = x.reshape(N, -1); w_row = w.reshape(w.shape[0], -1).T; out = np.dot(x_row, w_row) + b.reshape(1, -1); cache = (x, w, b); return out, cache`, it fails with `shapes (40,6272) and (100,8192) not aligned: 6272 (dim 1) != 100 (dim 0)`
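The error is raised by `np.dot(x_row, w_row)`: `x_row` has 6272 columns, while `w_row` has 100 rows. The message shows that `w_row` has shape (100, 8192), so in this run `w` has shape (8192, 100), and the added transpose only moves the mismatch. Without it the product would be (40, 6272) by (8192, 100), which fails as well. Reshaping `x` to 8192 columns is not an option, because its 40 × 6272 = 250880 elements cannot be split into rows of 8192. The transpose should be removed, and `w` should be created with 6272 rows so that it matches the flattened size of `x`:
```
import numpy as np

def affine_forward(x, w, b):
    N = x.shape[0]
    # Flatten each sample: (N, d_1, ..., d_k) -> (N, D), here D = 6272
    x_row = x.reshape(N, -1)
    # No transpose: w is expected to already have shape (D, M)
    out = np.dot(x_row, w) + b.reshape(1, -1)
    cache = (x, w, b)
    return out, cache
```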
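The alignment rule behind both error messages can be checked without running the network. The sketch below uses a hypothetical helper, `can_multiply`, that tests only the inner dimensions of two 2-D shapes; the list of candidate weight shapes is taken from the error messages above, plus the shape (6272, 100) that matches the flattened input.

```python
def can_multiply(a_shape, b_shape):
    """Return True if a 2-D matrix of shape a_shape can be multiplied by one of shape b_shape."""
    return a_shape[1] == b_shape[0]

x_row_shape = (40, 6272)
candidates = [(784, 100), (100, 784), (8192, 100), (100, 8192), (6272, 100)]
for w_shape in candidates:
    print(w_shape, can_multiply(x_row_shape, w_shape))

# Reshaping x to 8192 columns would require the element count to be a
# multiple of 8192; a non-zero remainder means no such reshape exists.
remainder = (40 * 6272) % 8192
print(remainder)
```

Only the candidate with 6272 rows passes the check, which is why transposing or reshaping the existing `w` cannot resolve the error.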
`def affine_forward(x, w, b): out = None; N = x.shape[0]; x_row = x.reshape(N, -1); w = np.random.randn(784, 100); out = np.dot(x_row, w) + b.reshape(1, -1); cache = (x, w, b); return out, cache` used for MNIST recognition fails with `shapes (40,6272) and (784,100) not aligned: 6272 (dim 1) != 784 (dim 0)`. How can this be solved?
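For the product to be defined, the number of columns of the first matrix must equal the number of rows of the second. `x.reshape(N, -1)` is already applied here and yields (40, 6272), so the input does not flatten to 784 values per sample, and no reshape can turn it into (40, 784). Either pass the raw MNIST batch to this layer (for example shape (40, 1, 28, 28) or (40, 784)), or pass in a `w` with 6272 rows. The function itself reduces to:
```
import numpy as np

def affine_forward(x, w, b):
    N = x.shape[0]
    # Flatten each sample into a row vector: (N, d_1, ..., d_k) -> (N, D)
    x_row = x.reshape(N, -1)
    # w is used exactly as passed in and must have shape (D, M)
    out = np.dot(x_row, w) + b.reshape(1, -1)
    cache = (x, w, b)
    return out, cache
```
Note that assigning `w = np.random.randn(784, 100)` inside the function is unnecessary, because `w` is passed in as a parameter. It is also harmful: it discards the weights being trained on every forward pass and forces the input dimension to 784.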
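To see which inputs a (784, 100) weight matrix can accept, the sketch below runs the same forward function on two batches. Both input shapes are assumptions for illustration: (40, 1, 28, 28) stands in for a raw MNIST batch, and (40, 8, 28, 28) stands in for an intermediate feature map that flattens to 6272 values per sample.

```python
import numpy as np

def affine_forward(x, w, b):
    N = x.shape[0]
    x_row = x.reshape(N, -1)                 # (N, D)
    out = np.dot(x_row, w) + b.reshape(1, -1)
    return out, (x, w, b)

w = np.ones((784, 100))
b = np.zeros(100)

raw = np.ones((40, 1, 28, 28))               # assumed raw batch: 1 * 28 * 28 = 784 per sample
out, _ = affine_forward(raw, w, b)
print(out.shape)

features = np.ones((40, 8, 28, 28))          # assumed feature map: 8 * 28 * 28 = 6272 per sample
try:
    affine_forward(features, w, b)
except ValueError as err:
    # np.dot reports the same "not aligned" message as in the question
    print("mismatch:", err)
```

The first call succeeds because the flattened size matches the rows of `w`; the second reproduces the reported error, which points to the caller's `x` or `w` rather than to the body of `affine_forward`.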