
Tutorial 7 Bug In D^{-1/2} A D^{-1/2} Math? #154

Open
johanos opened this issue Jan 8, 2025 · 0 comments
Labels: bug (Something isn't working)
johanos commented Jan 8, 2025

Tutorial: 7

Describe the bug
I think there's a math error for this claim "Finally, to take the average instead of summing, we calculate the matrix 𝐷̂ which is a diagonal matrix with 𝐷𝑖𝑖 denoting the number of neighbors node 𝑖 has."

I read this and couldn't see how that operation yields an average. Writing out the entries, I get

$$\left( D^{-1/2}AD^{-1/2}\right)_{ij} = \frac{A_{ij}}{\sqrt{D_{ii}} \sqrt{D_{jj}}}$$
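A quick way to see this numerically (a sketch using the tutorial's example adjacency matrix, nothing beyond standard PyTorch calls) is to compare the row sums of $D^{-1}A$, which is a true row-wise average, against those of $D^{-1/2}AD^{-1/2}$:

```python
import torch

# Tutorial 7's example adjacency matrix (self-loops already added)
A = torch.tensor([[1., 1., 0., 0.],
                  [1., 1., 1., 1.],
                  [0., 1., 1., 1.],
                  [0., 1., 1., 1.]])

deg = A.sum(dim=-1)                    # D_ii = number of neighbours of node i
D_inv = torch.diag(1.0 / deg)          # D^{-1}
D_inv_sqrt = torch.diag(deg ** -0.5)   # D^{-1/2}

row_avg = D_inv @ A                    # rows sum to exactly 1: a true average
sym_norm = D_inv_sqrt @ A @ D_inv_sqrt # rows generally do NOT sum to 1

print(row_avg.sum(dim=-1))   # tensor([1., 1., 1., 1.])
print(sym_norm.sum(dim=-1))  # first entry is 1/2 + 1/(2*sqrt(2)) ~= 0.8536
```

So the symmetric normalization preserves the "equal nodes give equal outputs" behaviour, but its rows are not convex combinations of the neighbours.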

which prompted me to implement the math directly, instead of just averaging the rows, to check:

import torch
import torch.nn as nn

class GCNLayer(nn.Module):

    def __init__(self, c_in, c_out):
        super().__init__()
        self.projection = nn.Linear(c_in, c_out)

    def forward(self, node_feats, adj_matrix):
        """
        Inputs:
            node_feats - Tensor with node features of shape [batch_size, num_nodes, c_in]
            adj_matrix - Batch of adjacency matrices of the graph. If there is an edge from i to j, adj_matrix[b,i,j]=1 else 0.
                         Supports directed edges via non-symmetric matrices. Assumes the identity connections have already been added.
                         Shape: [batch_size, num_nodes, num_nodes]
        """
        # Num neighbours = number of incoming edges
        num_neighbours = adj_matrix.sum(dim=-1, keepdim=True)

        # Avoid division by zero for isolated nodes
        num_neighbours = torch.clamp(num_neighbours, min=1)
        d_inv_sqrt = torch.pow(num_neighbours, -0.5)

        # Build the diagonal matrix D^{-1/2}: broadcasting the identity
        # [num_nodes, num_nodes] against d_inv_sqrt [batch, num_nodes, 1]
        # scales row i of the identity by D_ii^{-1/2}
        identity = torch.eye(num_neighbours.shape[1], device=node_feats.device)
        d_inv_sqrt_matrix = identity * d_inv_sqrt

        # Perform D^{-1/2} adj_matrix D^{-1/2}
        adj_matrix = torch.matmul(d_inv_sqrt_matrix, adj_matrix)
        adj_matrix = torch.matmul(adj_matrix, d_inv_sqrt_matrix)

        node_feats = self.projection(node_feats)
        node_feats = torch.bmm(adj_matrix, node_feats)

        return node_feats

which yielded results that still line up with the claims presented afterwards: in particular, the last two nodes still produce identical outputs, because they have exactly the same set of neighbours and the same degree.

Adjacency matrix
 tensor([[[1., 1., 0., 0.],
         [1., 1., 1., 1.],
         [0., 1., 1., 1.],
         [0., 1., 1., 1.]]])
Input features
 tensor([[[0., 1.],
         [2., 3.],
         [4., 5.],
         [6., 7.]]])
Output features
 tensor([[[0.7071, 1.5607],
         [3.3868, 4.5677],
         [3.9107, 4.8660], <- this node output is the same 
         [3.9107, 4.8660]]]) <- as this node, which was the point of the example. 
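That equality can be checked independently of any learned weights (a sketch: the symmetrically normalized rows for the last two nodes are identical, so their aggregated features must coincide before the projection is even applied). Interestingly, the aggregation alone already reproduces the values above, which suggests the posted numbers came from an identity projection:

```python
import torch

A = torch.tensor([[[1., 1., 0., 0.],
                   [1., 1., 1., 1.],
                   [0., 1., 1., 1.],
                   [0., 1., 1., 1.]]])
feats = torch.tensor([[[0., 1.],
                       [2., 3.],
                       [4., 5.],
                       [6., 7.]]])

deg = A.sum(dim=-1, keepdim=True)            # [1, 4, 1]
d_inv_sqrt = deg ** -0.5
# D^{-1/2} A D^{-1/2} via broadcasting: scale row i by D_ii^{-1/2}, column j by D_jj^{-1/2}
norm_A = d_inv_sqrt * A * d_inv_sqrt.transpose(1, 2)

out = torch.bmm(norm_A, feats)               # aggregation only, no projection
print(out[0, 2])  # tensor([3.9107, 4.8660])
print(out[0, 3])  # tensor([3.9107, 4.8660]) -- identical row in norm_A
```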

So I guess the main point is that the 'averaging' claim is not quite accurate: each entry is scaled by a pairwise factor involving both endpoints' degrees,

$$\left( D^{-1/2} A D^{-1/2} \right)_{ij} = D_{ii}^{-1/2} \, A_{ij} \, D_{jj}^{-1/2},$$

which only reduces to a plain row-wise average $A_{ij} / D_{ii}$ when all nodes have the same degree. At least, that's my intuition of the situation.

Runtime environment (please complete the following information):
Colab - GPU

@johanos johanos added the bug Something isn't working label Jan 8, 2025