RNN is not training (PyTorch)

Problem description:

I can't figure out what I'm doing wrong with my RNN. I'm trying to train an RNN on sequences (to understand how it behaves on a simple task), but the network isn't learning: the loss stays constant and it fails to model the data. Can you help me find the problem?

The data I'm using:

data = [ 
    [1, 1, 1, 1, 0, 0, 1, 1, 1], 
    [1, 1, 1, 1], 
    [0, 0, 1, 1], 
    [0, 0, 0, 0, 0, 0, 0], 
    [1, 1, 1, 1, 1, 1, 1], 
    [1, 1], 
    [0], 
    [1], 
    [1, 0]] 
labels = [ 
    0, 
    1, 
    0, 
    0, 
    1, 
    1, 
    0, 
    1, 
    0 
] 
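
Judging from the class name AndRNN used below, each label looks like the logical AND of its sequence (my reading; the question doesn't state it explicitly). A quick sanity check:

# Assumption: label = 1 iff every element of the sequence is 1 (logical AND)
assert labels == [int(all(seq)) for seq in data]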

The code for the NN:

import torch
import torch.nn as nn
from torch.autograd import Variable

class AndRNN(nn.Module):
    def __init__(self):
        super(AndRNN, self).__init__()
        # input_size=1, hidden_size=10, num_layers=5
        self.rnn = nn.RNN(1, 10, 5)
        self.fc = nn.Sequential(
            nn.Linear(10, 30),
            nn.Linear(30, 2)
        )

    def forward(self, input, hidden):
        x, hidden = self.rnn(input, hidden)
        # classify from the output of the last time step
        x = self.fc(x[-1])
        return x, hidden

    def initHidden(self):
        # (num_layers, batch, hidden_size)
        return Variable(torch.zeros((5, 1, 10)))

The training loop:

net = AndRNN()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

correct = 0
for e in range(20):
    for i in range(len(data)):
        # shape: (seq_len, batch=1, input_size=1)
        tensor = torch.FloatTensor(data[i]).view(-1, 1, 1)
        label = torch.LongTensor([labels[i]])
        hidden = net.initHidden()
        optimizer.zero_grad()

        out, hidden = net(Variable(tensor), Variable(hidden.data))

        _, l = torch.topk(out, 1)
        if label[0] == l[0].data[0]:
            correct += 1

        loss = criterion(out, Variable(label))
        loss.backward()
        optimizer.step()

        print("Loss ", loss.data[0], "Accuracy ", (correct / (i + 1)))

The shape of the tensor will be (seq_len, 1 (the batch size), 1), which is correct according to the PyTorch documentation for RNN.
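
A quick way to verify the shape (a minimal sketch; the literal values are just an example):

import torch
seq = torch.FloatTensor([1, 1, 0]).view(-1, 1, 1)
print(seq.size())  # torch.Size([3, 1, 1]) -> (seq_len, batch, input_size)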

The problem is this line:

out, hidden = net(Variable(tensor), Variable(hidden.data)) 

It should simply be:

out, hidden = net(Variable(tensor), hidden) 

By passing Variable(hidden.data) here, you are creating a brand-new hidden_state Variable (all zeros) at every step, instead of passing along the hidden state from the previous step.
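
A minimal sketch of what goes wrong (assuming PyTorch ≥ 0.4, where Variable(h.data) behaves like h.detach()): the rewrapped tensor is a fresh leaf cut out of the autograd graph, so no gradient can flow back through the recurrent state.

import torch

h = torch.zeros(5, 1, 10, requires_grad=True)
h_new = h.detach()          # effectively what Variable(h.data) does
print(h.requires_grad)      # True
print(h_new.requires_grad)  # False -> gradients no longer flow through it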

I tried your example and changed the optimizer to Adam. Here is the complete code.

import torch
import torch.nn as nn
from torch.autograd import Variable

class AndRNN(nn.Module):
    def __init__(self):
        super(AndRNN, self).__init__()
        self.rnn = nn.RNN(1, 10, 5)
        self.fc = nn.Sequential(
            nn.Linear(10, 30),
            nn.Linear(30, 2)
        )

    def forward(self, input, hidden):
        x, hidden = self.rnn(input, hidden)
        x = self.fc(x[-1])
        return x, hidden

    def initHidden(self):
        return Variable(torch.zeros((5, 1, 10)))

net = AndRNN()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters())

correct = 0
for e in range(100):
    for i in range(len(data)):
        tensor = torch.FloatTensor(data[i]).view(-1, 1, 1)
        label = torch.LongTensor([labels[i]])
        hidden = net.initHidden()
        optimizer.zero_grad()

        out, hidden = net(Variable(tensor), hidden)

        loss = criterion(out, Variable(label))
        loss.backward()
        optimizer.step()
    if e % 25 == 0:
        print("Loss ", loss.data[0])

Results:

Loss 0.6370733976364136 
Loss 0.25336754322052 
Loss 0.006924811284989119 
Loss 0.002351854695007205 
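
As a side note: on PyTorch ≥ 0.4, Variable is deprecated and plain tensors carry gradients, so the same loop can be written without it. A sketch of just the changed lines (my adaptation, not part of the original answer):

# On PyTorch >= 0.4, tensors replace Variables entirely:
tensor = torch.tensor(data[i], dtype=torch.float32).view(-1, 1, 1)
label = torch.tensor([labels[i]])
hidden = torch.zeros(5, 1, 10)

out, hidden = net(tensor, hidden)
loss = criterion(out, label)
print("Loss ", loss.item())  # .item() replaces loss.data[0]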

Thanks, that looks reasonable!