There might be a bug in the implementation of PolicyNet #8

Open
alvinsay opened this issue Dec 16, 2022 · 0 comments

According to the AlphaGo Zero cheat sheet from this article:

[image: AlphaGo Zero cheat sheet]

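In the policy head, the input tensor is convolved with two filters down to 2 channels (2x19x19), and a fully connected (FC) layer then decodes the flattened result into a vector of length 19x19 + 1 (one entry per board point plus one for the pass move).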
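The sizes involved can be worked out with a few lines of plain Python. This is only a sketch of the arithmetic; the helper name `policy_head_sizes` and its parameters are made up for illustration and do not come from the repository.

```python
def policy_head_sizes(board_size=19, head_channels=2):
    """Return (fc_input_size, fc_output_size) for the policy head.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    points = board_size * board_size   # 19 * 19 = 361 board points
    fc_in = head_channels * points     # flattened 2x19x19 feature map
    fc_out = points + 1                # one output per point, plus pass
    return fc_in, fc_out


fc_in, fc_out = policy_head_sizes()
print(fc_in, fc_out)  # 722 362
```

With `outplanes = 362`, the expression `2 * (outplanes - 1)` gives the same 722 inputs for the FC layer.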

The code of PolicyNet should therefore be:

import torch.nn as nn


class PolicyNet(nn.Module):
    def __init__(self, inplanes, outplanes):
        super(PolicyNet, self).__init__()
        self.outplanes = outplanes
        # convolve down to 2 planes
        self.conv = nn.Conv2d(inplanes, 2, kernel_size=1)
        # BatchNorm must match the 2 channels produced by self.conv
        self.bn = nn.BatchNorm2d(2)
        self.logsoftmax = nn.LogSoftmax(dim=1)
        # NxN = 19x19 = outplanes - 1
        # The FC layer decodes the flattened 2x19x19 input to 19x19 + 1 outputs
        self.fc_input_size = 2 * (outplanes - 1)
        self.fc = nn.Linear(self.fc_input_size, outplanes)
        self.af1 = nn.ReLU()

    def forward(self, x):
        x = self.af1(self.bn(self.conv(x)))
        x = x.view(-1, self.fc_input_size)
        x = self.fc(x)
        # exp(log_softmax(x)) equals softmax(x): move probabilities summing to 1
        probas = self.logsoftmax(x).exp()
        return probas
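
A quick shape check may help confirm the proposal. The sketch below rebuilds the same layers outside the class and pushes a random batch through them. The 256 input planes and the batch size of 4 are assumptions chosen for illustration, not values taken from the repository, and `torch.softmax` stands in for the `LogSoftmax(...).exp()` pair, which computes the same thing.

```python
import torch
import torch.nn as nn

board_size = 19
n_moves = board_size * board_size + 1   # 19x19 + 1 = 362
inplanes = 256                          # assumed number of input planes
batch = 4                               # assumed batch size

# Same layers as the proposed policy head, built standalone for the check.
head_conv = nn.Conv2d(inplanes, 2, kernel_size=1)
head_bn = nn.BatchNorm2d(2)             # 2 channels, matching the conv output
head_fc = nn.Linear(2 * board_size * board_size, n_moves)

x = torch.randn(batch, inplanes, board_size, board_size)
h = torch.relu(head_bn(head_conv(x)))   # shape (4, 2, 19, 19)
flat = h.view(h.size(0), -1)            # shape (4, 722)
logits = head_fc(flat)                  # shape (4, 362)
probas = torch.softmax(logits, dim=1)   # each row sums to 1

print(h.shape, flat.shape, probas.shape)
```

If the batch norm layer were created with a single feature, the call on the 2-channel tensor would fail, which is why the channel counts of `head_conv` and `head_bn` have to agree.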