There might be a bug in the implementation of PolicyNet #8

alvinsay · 2022-12-16T00:49:13Z

According to the AlphaGo Zero cheat sheet from this article, the policy head convolves the input tensor with two filters down to 2 channels (2x19x19), then uses a fully connected (FC) layer to decode that into a vector of length 19x19 + 1.

The code of PolicyNet should therefore be:

import torch.nn as nn


class PolicyNet(nn.Module):
    def __init__(self, inplanes, outplanes):
        super(PolicyNet, self).__init__()
        self.outplanes = outplanes
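        # 1x1 convolution down to 2 planes
        self.conv = nn.Conv2d(inplanes, 2, kernel_size=1)
        # BatchNorm2d must match the 2 channels produced by the conv
        self.bn = nn.BatchNorm2d(2)
        self.logsoftmax = nn.LogSoftmax(dim=1)
        # outplanes = N*N + 1 (19*19 + 1 for Go), so N*N = outplanes - 1
        # The FC layer decodes the flattened 2 x N x N input to N*N + 1 outputs
        self.fc_input_size = 2 * (outplanes - 1)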
        self.fc = nn.Linear(self.fc_input_size, outplanes)
        self.af1 = nn.ReLU()
        
    def forward(self, x):
        x = self.af1(self.bn(self.conv(x)))
        x = x.view(-1, self.fc_input_size)
        x = self.fc(x)
        probas = self.logsoftmax(x).exp()
        return probas
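
To sanity-check the shapes described above, here is a minimal sketch. `PolicyHead` is a hypothetical stand-in for the `PolicyNet` in this issue (not the repository's class), and the 256 input planes and batch size of 4 are assumptions chosen only for illustration.

```python
import torch
import torch.nn as nn


class PolicyHead(nn.Module):
    """Hypothetical stand-in for the proposed PolicyNet, for a shape check only."""

    def __init__(self, inplanes, board_size=19):
        super().__init__()
        self.flat = 2 * board_size * board_size      # 2 planes of N x N
        self.conv = nn.Conv2d(inplanes, 2, kernel_size=1)
        self.bn = nn.BatchNorm2d(2)                  # matches the 2 conv channels
        self.fc = nn.Linear(self.flat, board_size * board_size + 1)

    def forward(self, x):
        x = torch.relu(self.bn(self.conv(x)))        # (B, 2, N, N)
        x = x.view(-1, self.flat)                    # (B, 2*N*N)
        return torch.softmax(self.fc(x), dim=1)      # (B, N*N + 1)


head = PolicyHead(inplanes=256)                      # 256 planes is an assumed width
features = torch.randn(4, 256, 19, 19)               # assumed batch of 4 feature maps
probas = head(features)
print(probas.shape)  # torch.Size([4, 362])
```

Each row of `probas` should sum to 1, since it is a distribution over the 361 board points plus the pass move.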
