
Cannot find the contraction order for CPU version. #506

Open
hunghaoti opened this issue Nov 10, 2024 · 6 comments
Labels: bug (Something isn't working)

@hunghaoti
Collaborator

[Screenshot from 2024-11-10 13-18-04: diagram of the tensor network to be contracted]
The optimal contraction order for the tensor network shown in the attached figure cannot be found by the CPU version, but it can be found by the GPU version (based on cuQuantum, with all UniTensor objects defined on the GPU device). The following code tries to find the optimal order but does not finish within one day.

import sys,os
sys.path.insert(0,'/home/hunghaoti/Cytnx_lib/')
import numpy as np
import cytnx

chi = 8 
chi_int = chi 
chi_bd = chi 
d = 2 
# C
C1 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd],0.,1.))
C2 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd],0.,1.))
C3 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd],0.,1.))
C4 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd],0.,1.))
# A, B
A1 = cytnx.UniTensor(cytnx.random.normal([chi_int, chi_int, chi_int, chi_int, d],0.,1.))
A2 = cytnx.UniTensor(cytnx.random.normal([chi_int, chi_int, chi_int, chi_int, d],0.,1.))
B1 = cytnx.UniTensor(cytnx.random.normal([chi_int, chi_int, chi_int, chi_int, d],0.,1.))
B2 = cytnx.UniTensor(cytnx.random.normal([chi_int, chi_int, chi_int, chi_int, d],0.,1.))
A1T = A1.Conj()
A2T = A2.Conj()
B1T = B1.Conj()
B2T = B2.Conj()
# T
T1a = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd, chi_int, chi_int],0.,1.))
T1b = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd, chi_int, chi_int],0.,1.))
T2a = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd, chi_int, chi_int],0.,1.))
T2b = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd, chi_int, chi_int],0.,1.))
T3a = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd, chi_int, chi_int],0.,1.))
T3b = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd, chi_int, chi_int],0.,1.))
T4a = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd, chi_int, chi_int],0.,1.))
T4b = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd, chi_int, chi_int],0.,1.))
# O
O = cytnx.UniTensor(cytnx.random.normal([d, d, d, d],0.,1.))

anet = cytnx.Network("iPEPS_observe.net")

anet.PutUniTensors(
        ["C1","C2","C3","C4","A1","A2","B1","B2","A1T","A2T","B1T","B2T","T1a","T1b",
         "T2a","T2b","T3a","T3b","T4a","T4b","O"],
        [C1,C2,C3,C4,A1,A2,B1,B2,A1T,A2T,B1T,B2T,T1a,T1b,T2a,T2b,T3a,T3b,T4a,T4b,O])
anet.setOrder(optimal=True)
# Print the current contraction order
print(anet.getOrder())
#res = anet.Launch()
#res.print_diagram()

The network file 'iPEPS_observe.net':

C1: 1, 2
C2: 34, 35
C3: 41, 42
C4: 19, 16
T1b: 1, 20, 3, 4
T1a: 20, 34, 21, 22
T2a: 35, 38, 36, 37
T2b: 38, 41, 39, 40
T3b: 42, 33, 31, 32
T3a: 33, 19, 17, 18
T4a: 16, 13, 14, 15
T4b: 13, 2, 5, 6
A1: 4, 24, 10, 6, 8
A1T: 3, 23, 9, 5, 7
B1: 22, 37, 27, 24, 25
B1T: 21, 36, 26, 23, 25
A2: 27, 40, 32, 29, 30
A2T: 26, 39, 31, 28, 30
B2: 10, 29, 18, 15, 12
B2T: 9, 28, 17, 14, 11
O: 7, 8, 11, 12
TOUT:
@yingjerkao
Collaborator

Is it possible for the CPU version to print debugging information?

@manuschneider manuschneider added the bug Something isn't working label Nov 11, 2024
@pcchen
Collaborator

pcchen commented Nov 13, 2024

For Fig. 12 (of the SciPost paper):

import cytnx
import numpy as np
net = cytnx.Network()
net.FromString(["c0: t0-c0, t3-c0",\
                "c1: t1-c1, t0-c1",\
                "c2: t2-c2, t1-c2",\
                "c3: t3-c3, t2-c3",\
                "t0: t0-c1, w-t0, t0-c0",\
                "t1: t1-c2, w-t1, t1-c1",\
                "t2: t2-c3, w-t2, t2-c2",\
                "t3: t3-c0, w-t3, t3-c3",\
                "w: w-t0, w-t1, w-t2, w-t3",\
                "TOUT:",\
                "ORDER: ((((((((c0,t0),c1),t3),w),t1),c3),t2),c2)"])

chi = 2
chi_int = chi 
chi_bd = chi 
# c
c0 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd],0.,1.))
c1 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd],0.,1.))
c2 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd],0.,1.))
c3 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_bd],0.,1.))
# t
t0 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_int, chi_bd],0.,1.))
t1 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_int, chi_bd],0.,1.))
t2 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_int, chi_bd],0.,1.))
t3 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_int, chi_bd],0.,1.))
# w
w = cytnx.UniTensor(cytnx.random.normal([chi_int, chi_int, chi_int, chi_int],0.,1.))
    
net.PutUniTensors(["c0","c1","c2","c3"], [c0,c1,c2,c3])
net.PutUniTensors(["t0","t1","t2","t3"], [t0,t1,t2,t3])
net.PutUniTensors(["w"], [w])
    
net.setOrder(optimal=True)
print(net.getOrder())
res = net.Launch()
res.print_diagram()

@pcchen
Collaborator

pcchen commented Nov 13, 2024

I found something strange.

If I modify the bond dimension of c0's t3-c0 bond but DO NOT modify the bond dimension of t0's t3-c0 bond, I can still get a result from net.setOrder(optimal=True).

c0 = cytnx.UniTensor(cytnx.random.normal([chi_bd, 100],0.,1.))
t0 = cytnx.UniTensor(cytnx.random.normal([chi_bd, chi_int, chi_bd],0.,1.))
net.setOrder(optimal=True)
print(net.getOrder())
(w,((c0,c1),((c2,c3),((t0,t1),(t2,t3)))))

res = net.Launch() will then result in a RuntimeError (as expected).

@yingjerkao
Collaborator

yingjerkao commented Nov 14, 2024

I asked ChatGPT to compare netcon_nondisj_cpp.cpp from the paper with search_tree.cpp in Cytnx. Here is the output:


The two codes share a similar purpose—finding an efficient contraction sequence for tensors in a network—but they implement the algorithm differently. Here’s a comparison of their structures and approaches:

Similarities

  1. Objective: Both codes aim to find the optimal sequence of contractions for a set of tensors, minimizing the computational cost of contracting the entire network.
  2. Cost Calculation: Both implementations calculate the contraction cost between pairs of tensors. This cost calculation considers the shape of tensors and shared indices, as tensors with common labels (indices) require additional calculations.
  3. Use of Containers: Both codes organize tensors in containers. The first code uses objectTree and objectList classes to maintain a hierarchical structure, while the second code uses nodes_container (a vector of vectors) to track tensors based on their size.

Differences

  1. Data Structures:

    • First Code: Uses objectTree, objectList, and tensorXlist to manage and track tensor contractions. These structures allow efficient insertion, lookup, and hierarchy-based management of objects. The objectTree structure also handles various constraints on tensor contraction.
    • Second Code: Primarily uses a vector-based approach (nodes_container), where each vector entry stores tensors of a specific "depth" in the contraction process. This approach is simpler and relies on fewer structures, making it potentially more direct but less flexible for certain complex tensor network topologies.
  2. Constraint Management:

    • First Code: Implements numerous constraints (e.g., outer product restrictions, index dimension constraints) to limit possible contractions and enforce efficient tensor reuse. It checks compatibility based on predefined rules and applies cost bounds (muCap and oldmuCap) to restrict contraction combinations.
    • Second Code: Implements basic restrictions by checking if tensors have overlapping IDs (indicating they have already been contracted) and ensuring some label (index) compatibility. The secondtimescan flag enables a second pass to handle Kronecker products if no valid contraction pairs are found in the first pass.
  3. Function Organization:

    • First Code: Uses numerous helper functions (getprodlegdims, addToTensorXlist, etc.) to modularize functionality. This makes the code more extensible and reusable.
    • Second Code: Encapsulates much of the functionality within the SearchTree::search_order method, making it more compact but potentially harder to extend with additional constraints.
  4. Tensor Representation:

    • First Code: Uses custom structures (objectData, tensorXlist) to represent tensors and store properties such as sequence, cost, flags, and tensor connections.
    • Second Code: Uses PseudoUniTensor with properties like ID, cost, shape, labels, and accu_str to represent each tensor’s state.
  5. Programming Complexity and Modularity:

    • First Code: Highly modular and complex, allowing fine-grained control over each tensor contraction. It uses detailed constraints to optimize contractions.
    • Second Code: Simpler and less modular, with the core logic in a single method, making it easier to follow but less adaptable to highly specialized contraction rules.

Summary

While both codes aim to optimize tensor contraction sequences, the first code offers a more complex, modular, and constraint-aware approach suitable for large, diverse tensor networks. The second code is a more streamlined implementation, better suited for simpler or smaller tensor networks without extensive constraints.
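As a concrete illustration of the pairwise cost calculation both implementations share, here is a sketch in plain Python (the function name and representation are made up for illustration; this is not the Cytnx or netcon API): the cost of contracting two tensors is taken as the product of the dimensions over the union of their labels, with shared (contracted) labels counted once.

```python
# Illustrative sketch, not the Cytnx API: pairwise contraction cost as the
# product of dimensions over the union of the two tensors' labels.
def contraction_cost(labels_a, dims_a, labels_b, dims_b):
    """Multiplicative cost of contracting two tensors over common labels."""
    dims = dict(zip(labels_a, dims_a))
    dims.update(zip(labels_b, dims_b))  # shared labels must agree in dimension
    cost = 1
    for d in dims.values():
        cost *= d
    return cost

# Example: a (i,j) x (j,k) matrix product with i=2, j=3, k=4 costs 2*3*4 = 24.
print(contraction_cost(["i", "j"], [2, 3], ["j", "k"], [3, 4]))
```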


Reading the paper, it seems we are implementing a simplified version of the algorithm, but further investigation is necessary.

@yingjerkao
Collaborator

yingjerkao commented Nov 14, 2024

(Quoting @pcchen's comment above about the mismatched t3-c0 bond dimension.)

There seems to be no check on the bond dimensions in the optimal-order search; only the common labels are checked. This check should be implemented in the code.
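A sketch of what such a pre-check could look like, in plain Python (the helper name and data layout are hypothetical, not part of the Cytnx API): verify that every shared label carries the same bond dimension on all tensors that use it, so a mismatch like the c0/t3 one above is caught before Launch().

```python
# Hypothetical pre-check, not part of the Cytnx API: before the optimal-order
# search, verify that every shared label has the same bond dimension on all
# tensors that use it.
def check_bond_dims(tensors):
    """tensors: dict name -> (labels, shape); raises ValueError on mismatch."""
    seen = {}  # label -> (dimension, name of first tensor using it)
    for name, (labels, shape) in tensors.items():
        for lbl, dim in zip(labels, shape):
            if lbl in seen and seen[lbl][0] != dim:
                raise ValueError(f"label '{lbl}' has dim {dim} on '{name}' "
                                 f"but dim {seen[lbl][0]} on '{seen[lbl][1]}'")
            seen.setdefault(lbl, (dim, name))

# The mismatch above (c0's t3-c0 leg enlarged to 100 while t3 stays at chi=2)
# would be rejected before any contraction is attempted:
net = {"c0": (["t0-c0", "t3-c0"], [2, 100]),
       "t3": (["t3-c0", "w-t3", "t3-c3"], [2, 2, 2])}
try:
    check_bond_dims(net)
except ValueError as err:
    print(err)
```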

@yingjerkao
Collaborator

I believe Cytnx implemented a breadth-first construction as described in Sec. II.A.2 of Jutho's paper, PRE 90, 033315 (2014), which is not the most efficient method. We will need to adapt the code to the existing Cytnx implementation.
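For reference, the breadth-first idea from Sec. II.A.2 can be sketched in a few lines of plain Python (illustrative only; the names, cost model, and data structures are simplified and this is not the Cytnx SearchTree code): build the optimal contraction of every k-tensor subset from optimal contractions of its smaller subsets.

```python
# Illustrative breadth-first optimal-order search (simplified, not Cytnx code).
from itertools import combinations

def optimal_order(tensors):
    """tensors: list of dicts label -> dim. Returns (order_string, total_cost)."""
    n = len(tensors)
    # best[frozenset of tensor indices] = (cost, order string, open-leg dims)
    best = {frozenset([i]): (0, str(i), dict(t)) for i, t in enumerate(tensors)}
    for size in range(2, n + 1):
        for subset in combinations(range(n), size):
            key = frozenset(subset)
            for split in range(1, size // 2 + 1):
                for left in combinations(subset, split):
                    a, b = frozenset(left), key - frozenset(left)
                    if a not in best or b not in best:
                        continue
                    ca, sa, da = best[a]
                    cb, sb, db = best[b]
                    merged = dict(da)
                    merged.update(db)
                    step = 1  # cost of this pairwise contraction
                    for d in merged.values():
                        step *= d
                    # open legs of the pair: labels present on only one side
                    open_legs = {l: d for l, d in merged.items()
                                 if (l in da) != (l in db)}
                    cost = ca + cb + step
                    if key not in best or cost < best[key][0]:
                        best[key] = (cost, f"({sa},{sb})", open_legs)
    cost, order, _ = best[frozenset(range(n))]
    return order, cost

# Matrix chain (i,j)(j,k)(k,l): contracting the large-j pair first is cheapest.
order, cost = optimal_order([{"i": 2, "j": 10}, {"j": 10, "k": 2}, {"k": 2, "l": 2}])
print(order, cost)
```

The exhaustive enumeration above is exponential in the number of tensors; the paper's refinements (cost caps such as muCap, outer-product restrictions) prune this search, which is what the netcon reference code adds on top of the basic breadth-first loop.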
