Examples #17

Open
skn123 opened this issue May 20, 2014 · 18 comments
@skn123

skn123 commented May 20, 2014

Is there an example in tapkee to do the following?
a.) Load a data file from a text file
b.) Call a dimension reduction method
c.) Output the mapped data back to a text file.

@iglesias
Collaborator

Did you browse through the examples directory? Some examples load data from files and then plot the embedded data, others write the data in the standard output.

@skn123
Author

skn123 commented May 20, 2014

Can you point me to one such example? I would be interested in an API that is similar to the example shown on the main page of github.

@iglesias
Collaborator

@lisitsyn
Owner

Hey. @skn123 are you talking about a command-line call example?

@skn123
Author

skn123 commented May 20, 2014

Hi @lisitsyn I am indeed talking about that. What I would like is a way to pass as input a text file with each row corresponding to a feature vector, along with the method that I would like to implement and its parameters. After the processing, the mapped output should be written out as a text file.

@skn123
Author

skn123 commented May 20, 2014

@iglesias Indeed, that method is close enough! @lisitsyn have you considered hiving off the K-NN part of computing the neighborhood graph as an OpenCL module?

@lisitsyn
Owner

@skn123 quite easy. Use the tapkee_cli binary with the options -i input_file.dat and -o output_file.dat. Let me cite an example from the help:

Run locally linear embedding with k=10 with arpack eigensolver on data from input.dat saving embedding to output.dat

tapkee -i input.dat -o output.dat --method lle --eigen-method arpack -k 10

@lisitsyn
Owner

@skn123 yeah, there are a bunch of things I'd like to improve with OpenCL as well, it's just a matter of lacking time :(

@skn123
Author

skn123 commented May 20, 2014

@lisitsyn I am unable to build tapkee_cli using MinGW, the bug that I had mentioned earlier. If you can help me solve that issue then I would be set. Suppose I do not provide arpack as the eigen-method. Will it revert to some default eigensolver?

@lisitsyn
Owner

@skn123 ah sorry, I didn't recognize it's you who had that issue :) I'll try to resolve this issue tomorrow.

The default eigenmethod is ok, yeah, you don't have to specify it explicitly.

@skn123
Author

skn123 commented May 20, 2014

@lisitsyn Maybe you can have a sub-project for Kd-tree / K-nn using OpenCL and that would be a nice fit for Tapkee.

@lisitsyn
Owner

@skn123 yeah or something to outsource things to :)

@skn123
Author

skn123 commented May 21, 2014

@lisitsyn I see that Tapkee uses ltsa as one of the methods. Can you point me to the eigensolver that this method uses to compute the eigenvalues, given that the matrices will be sparse?

@lisitsyn
Owner

@skn123 it uses ARPACK, as it only requires matrix-vector products. Sparsity does not matter to ARPACK this way. It is also possible to use other methods from Eigen3, though.

@skn123
Author

skn123 commented May 21, 2014

@lisitsyn suppose I don't specify the eigensolver (which means I don't want to use ARPACK), then how will it go about solving it? I may have a sparse matrix of 1 million entries!

@skn123
Author

skn123 commented May 21, 2014

TapkeeOutput embedKernelLocalTangentSpaceAlignment()
{
    Neighbors neighbors = findNeighborsWith(kernel_distance);
    SparseWeightMatrix weight_matrix = 
        tangent_weight_matrix(begin,end,neighbors,kernel,p_target_dimension,p_eigenshift);
    DenseMatrix embedding =
        eigendecomposition(p_eigen_method,p_computation_strategy,SmallestEigenvalues,
                weight_matrix,p_target_dimension).first;

    return TapkeeOutput(embedding, unimplementedProjectingFunction());
}

So the point is: without using ARPACK and assuming that the matrix is sparse, how do you go about computing the eigenvectors?

@lisitsyn
Owner

@skn123 ha! good point, sorry I confused you about it. It seems that w/o ARPACK it would convert the matrix to a dense one. This is a no-go for sure. I'll try to fix it as soon as I get some time (a few days I hope)

@skn123
Author

skn123 commented May 21, 2014

@lisitsyn Even if you were to use randomization (as in red-svd), you will still face the problem of finding the "bottom" eigenvectors. Do you have any thoughts on how we can compute the bottom eigenvectors without using ARPACK?
