Initial cut for a cuVS Java API #450

Open
wants to merge 75 commits into
base: branch-25.02
Changes from 64 commits
Commits
75 commits
c777924
Initial cut for a cuVS Java API
Nov 8, 2024
f9c2df7
API changes, serialize optional tmpFile param + other changes
Nov 9, 2024
1597158
make topK configurable
Nov 11, 2024
6f405e9
multiple fixes
Nov 14, 2024
8531be2
pom file update
Nov 14, 2024
c9b8891
sample test
Nov 14, 2024
9bb865a
sample test update + other changes
Nov 14, 2024
d40b3d0
sample test update + use logging
Nov 14, 2024
1e03585
variable name updates + other
Nov 15, 2024
3222536
Javadoc draft 1
Nov 16, 2024
c5536b6
Fixes for Javadocs, reducing visibility of internals, other refactorings
chatman Nov 17, 2024
2d79580
Loading the .so file from the jar instead of a hardcoded path
Nov 17, 2024
84c6121
package renaming and formatting
Nov 17, 2024
20562ee
moving CuVSResources + other updates
Nov 18, 2024
ad11369
code refactoring
Nov 18, 2024
6a851ef
javadoc maven config update and bug fixes
Nov 18, 2024
68e8b94
Consolidating all Arenas and Linkers to CuVSResources, fixing copyrig…
Nov 26, 2024
3852f6e
Bumping version to 24.12, installing to local maven with build.sh & s…
Nov 26, 2024
479e488
major performance improvement - reduced creation time for memory segm…
Nov 27, 2024
43f5e15
Adding randomized test, fixes for memory allocation and deallocation,…
narangvivek10 Dec 12, 2024
3438dbf
Initial cut for a cuVS Java API
Nov 8, 2024
4411629
API changes, serialize optional tmpFile param + other changes
Nov 9, 2024
b7da0ef
make topK configurable
Nov 11, 2024
c4346d9
multiple fixes
Nov 14, 2024
cefadb8
pom file update
Nov 14, 2024
8492c72
sample test
Nov 14, 2024
8e7c62c
sample test update + other changes
Nov 14, 2024
eb94a0f
sample test update + use logging
Nov 14, 2024
d32f8f9
variable name updates + other
Nov 15, 2024
bbcbba1
Javadoc draft 1
Nov 16, 2024
696f015
Fixes for Javadocs, reducing visibility of internals, other refactorings
chatman Nov 17, 2024
b95bb7f
Loading the .so file from the jar instead of a hardcoded path
Nov 17, 2024
9ab2754
package renaming and formatting
Nov 17, 2024
638092c
moving CuVSResources + other updates
Nov 18, 2024
51eb397
code refactoring
Nov 18, 2024
fb295dd
javadoc maven config update and bug fixes
Nov 18, 2024
80174d2
Consolidating all Arenas and Linkers to CuVSResources, fixing copyrig…
Nov 26, 2024
61ea9ac
Bumping version to 24.12, installing to local maven with build.sh & s…
Nov 26, 2024
1803dee
major performance improvement - reduced creation time for memory segm…
Nov 27, 2024
7cda1e5
Adding randomized test, fixes for memory allocation and deallocation,…
narangvivek10 Dec 12, 2024
53dfa23
Merge branch 'branch-25.02' into java-api
Dec 13, 2024
6f708eb
Upgrading to 25.02
Dec 13, 2024
b98d2bb
Bruteforce API implementation (#8)
narangvivek10 Dec 21, 2024
81b60f1
Bruteforce serialize and deserialize API implementation (#9)
narangvivek10 Dec 26, 2024
dca835d
Merge branch 'rapidsai:branch-25.02' into java-api
narangvivek10 Dec 31, 2024
e3757c0
`metric` parameter addition to the CAGRA index parameters (#10)
narangvivek10 Jan 1, 2025
0aa2665
update year
narangvivek10 Jan 1, 2025
a89fe5d
API update - move prefilter from BruteForceIndex to BruteForceQuery
narangvivek10 Jan 2, 2025
b44baa9
HNSW API implementation
narangvivek10 Jan 4, 2025
98c3c93
Mapping is a single integer list, added an error logging for absurd v…
Jan 6, 2025
feed385
Compilation error fix (enum needs to be mentioned as per C standards)
Jan 6, 2025
1d80b41
If dataset is smaller than topK, then adjust topK to be the dataset size
Jan 6, 2025
2f0f143
Add randomized tests for HNSW and Bruteforce
narangvivek10 Jan 7, 2025
2c0e4cd
Adding module-info.java
Jan 7, 2025
72bc6e3
Review updates
narangvivek10 Jan 8, 2025
b79a17e
Removing dependency on slf4j and commons-io
Jan 10, 2025
6819bdf
Plugin configuration update in pom.xml and javadoc fix
Jan 10, 2025
bc280e4
Review feedback updates
narangvivek10 Jan 22, 2025
392d0ff
Feature: get GPU information
narangvivek10 Jan 23, 2025
a0dba4a
Merge branch 'branch-25.02' into java-api
narangvivek10 Jan 23, 2025
bfd355b
fix cmake format and update year
Jan 24, 2025
ab1c53f
Adding CMAKE_PREFIX_PATH variable back to the Java build.sh script, a…
Jan 24, 2025
a197d52
Add examples, update GPU info API methods, and update CI script
narangvivek10 Jan 24, 2025
69ef479
Merge branch 'branch-25.02' into java-api
narangvivek10 Jan 24, 2025
7eb00d5
Ensure Arena is closed when resources is closed
ChrisHegarty Jan 27, 2025
fd5e1c1
Merge remote-tracking branch 'rapidsai/branch-25.02' into java-api
Jan 27, 2025
a24b3df
Fixing compile error due to additional filter param for cagra search,…
Jan 27, 2025
daedfcd
Merge branch 'branch-25.02' into java-api
narangvivek10 Jan 27, 2025
cf6efdd
close arena only if not alive, update version in pom.xml and cleanup
Jan 27, 2025
704d8d6
Refactor GPU info API, search results and simplify mapping for hnsw a…
narangvivek10 Jan 28, 2025
37c5dba
Update Java versioning in the CI script
narangvivek10 Jan 28, 2025
c0d146b
Merge branch 'branch-25.02' into java-api
narangvivek10 Jan 28, 2025
8e42da0
Build script update, revert files with only copyright changes, and up…
narangvivek10 Jan 28, 2025
136f658
cleanup
Jan 28, 2025
94a02fb
Merge branch 'branch-25.02' into java-api
narangvivek10 Jan 29, 2025
5 changes: 5 additions & 0 deletions .gitignore
@@ -83,3 +83,8 @@ ivf_pq_index
# cuvs_bench
datasets/
/*.json

# java
.classpath


14 changes: 12 additions & 2 deletions build.sh
@@ -18,14 +18,15 @@ ARGS=$*
# scripts, and that this script resides in the repo dir!
REPODIR=$(cd $(dirname $0); pwd)

-VALIDARGS="clean libcuvs python rust docs tests bench-ann examples --uninstall -v -g -n --compile-static-lib --allgpuarch --no-mg --no-cpu --cpu-only --no-shared-libs --no-nvtx --show_depr_warn --incl-cache-stats --time -h"
+VALIDARGS="clean libcuvs python rust java docs tests bench-ann examples --uninstall -v -g -n --compile-static-lib --allgpuarch --no-mg --no-cpu --cpu-only --no-shared-libs --no-nvtx --show_depr_warn --incl-cache-stats --time -h"
HELP="$0 [<target> ...] [<flag> ...] [--cmake-args=\"<args>\"] [--cache-tool=<tool>] [--limit-tests=<targets>] [--limit-bench-ann=<targets>] [--build-metrics=<filename>]
where <target> is:
clean - remove all existing build artifacts and configuration (start over)
libcuvs - build the cuvs C++ code only. Also builds the C-wrapper library
around the C++ code.
python - build the cuvs Python package
rust - build the cuvs Rust bindings
java - build the cuvs Java bindings
docs - build the documentation
tests - build the tests
bench-ann - build end-to-end ann benchmarks
@@ -61,7 +62,8 @@ SPHINX_BUILD_DIR=${REPODIR}/docs
DOXYGEN_BUILD_DIR=${REPODIR}/cpp/doxygen
PYTHON_BUILD_DIR=${REPODIR}/python/cuvs/_skbuild
RUST_BUILD_DIR=${REPODIR}/rust/target
-BUILD_DIRS="${LIBCUVS_BUILD_DIR} ${PYTHON_BUILD_DIR} ${RUST_BUILD_DIR}"
+JAVA_BUILD_DIR=${REPODIR}/java/cuvs-java/target
+BUILD_DIRS="${LIBCUVS_BUILD_DIR} ${PYTHON_BUILD_DIR} ${RUST_BUILD_DIR} ${JAVA_BUILD_DIR}"

# Set defaults for vars modified by flags to this script
CMAKE_LOG_LEVEL=""
@@ -445,6 +447,14 @@ if (( ${NUMARGS} == 0 )) || hasArg rust; then
cargo test
fi

# Build the cuvs Java bindings
if (( ${NUMARGS} == 0 )) || hasArg java; then
# build libcuvs first as the Java API depends on it
./$0 libcuvs
Contributor

I'm not sure it is good practice to call this script recursively to build the C++ code; it seems likely to lead to unexpected behavior. For example, how do you ensure that CLI arguments such as `--allgpuarch` or `--no-mg` are passed through?

I would recommend removing this line and requiring users to call `./build.sh libcuvs java` if the C++ library has not yet been built. Perhaps a warning can be issued if `hasArg libcuvs` is false. I believe this is how we handle the Python builds, which also depend on the C++ libraries being built.

Suggested change
./$0 libcuvs
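
The warning suggested in this comment could look roughly like the following. This is a minimal sketch, not the change that was eventually pushed: `has_arg` and `warn_if_libcuvs_missing` are hypothetical helpers written for illustration (the real `build.sh` has its own `hasArg`, whose implementation is not shown in this diff), and the wording of the message is an assumption.

```shell
#!/bin/sh
# Hypothetical stand-in for build.sh's hasArg: succeeds if the word "$1"
# appears in the space-separated argument string "$2".
has_arg() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print a warning on stderr when "java" is requested without "libcuvs".
# Nothing is built here; the caller is expected to continue and rely on
# a libcuvs that was built earlier.
warn_if_libcuvs_missing() {
  if has_arg java "$1" && ! has_arg libcuvs "$1"; then
    echo "warning: 'java' requested without 'libcuvs';" \
         "run './build.sh libcuvs java' if the C++ library is not built yet" >&2
  fi
}

warn_if_libcuvs_missing "java --allgpuarch"   # warns
warn_if_libcuvs_missing "libcuvs java"        # silent
```

Because the warning only inspects the arguments of the current invocation, flags such as `--allgpuarch` stay attached to the single `libcuvs` build the user asked for, and no recursive call is needed.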

cd ${REPODIR}/java
cjnolet marked this conversation as resolved.
./build.sh
fi

export RAPIDS_VERSION="$(sed -E -e 's/^([0-9]{2})\.([0-9]{2})\.([0-9]{2}).*$/\1.\2.\3/' "${REPODIR}/VERSION")"
export RAPIDS_VERSION_MAJOR_MINOR="$(sed -E -e 's/^([0-9]{2})\.([0-9]{2})\.([0-9]{2}).*$/\1.\2/' "${REPODIR}/VERSION")"

6 changes: 6 additions & 0 deletions ci/release/update-version.sh
@@ -96,3 +96,9 @@ find .devcontainer/ -type f -name devcontainer.json -print0 | while IFS= read -r
sed_runner "s@rapidsai/devcontainers/features/rapids-build-utils:[0-9.]*@rapidsai/devcontainers/features/rapids-build-utils:${NEXT_SHORT_TAG_PEP440}@" "${filename}"
sed_runner "s@rapids-\${localWorkspaceFolderBasename}-${CURRENT_SHORT_TAG}@rapids-\${localWorkspaceFolderBasename}-${NEXT_SHORT_TAG}@g" "${filename}"
done

# Update Java API version
sed_runner "s/VERSION=\".*\"/VERSION=\"${NEXT_FULL_TAG}\"/g" java/build.sh
for FILE in java/*/pom.xml; do
sed_runner "/<!--CUVS_JAVA#VERSION_UPDATE_MARKER_START-->.*<!--CUVS_JAVA#VERSION_UPDATE_MARKER_END-->/s//<!--CUVS_JAVA#VERSION_UPDATE_MARKER_START--><version>${NEXT_FULL_TAG}<\/version><!--CUVS_JAVA#VERSION_UPDATE_MARKER_END-->/g" "${FILE}"
Contributor

This is currently inconsistent with our normal use of the script. We usually run `./ci/release/update-version.sh 25.02.00`, which would assign a value of `25.02.00`. However, the pom.xml files use `25.02` without the patch version. I would guess that we do want a patch version here, but perhaps we want to normalize it to `25.02.0`. That's close to what we do in cuDF's Java code.

Please look at https://github.com/rapidsai/cudf/blob/133e0c869531af94474e0bbb66cb22c5f8ba80f2/ci/release/update-version.sh#L87-L91 and https://github.com/rapidsai/cudf/blob/133e0c869531af94474e0bbb66cb22c5f8ba80f2/java/pom.xml#L24 and see whether we should adopt the same patterns here.

Author

Thanks Bradley, I've updated the version number format to also include a patch number. This should be better for Java artifacts.

Contributor

Do you think we should normalize it to 25.02.0 instead of 25.02.00? That would align with what we do for Java code elsewhere in RAPIDS, including cuDF and KvikIO.

Hey @bdice, I will update the versioning following your suggestion above and push this change soon.
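
The normalization discussed in this thread (`25.02.00` to `25.02.0`) can be sketched as follows. This is an illustration under stated assumptions, not the code that was pushed and not a copy of cuDF's script: `normalize_java_version` is a hypothetical helper, the input is assumed to have exactly three dot-separated numeric fields, and the marker rewrite uses capture groups rather than the `/pattern/s//.../` form in the diff above. The demo operates on a temporary file, not on a real `pom.xml`.

```shell
#!/bin/sh
# Normalize a RAPIDS-style tag (e.g. 25.02.00) to a Maven-style version
# (25.02.0) by dropping leading zeros from the patch component only.
normalize_java_version() {
  major_minor=${1%.*}      # "25.02.00" -> "25.02"
  patch=${1##*.}           # "25.02.00" -> "00"
  # Strip leading zeros; fall back to a single "0" if nothing remains.
  patch=$(printf '%s' "$patch" | sed 's/^0*//')
  printf '%s.%s\n' "$major_minor" "${patch:-0}"
}

# Markers copied from the pom.xml in this PR.
start='<!--CUVS_JAVA#VERSION_UPDATE_MARKER_START-->'
end='<!--CUVS_JAVA#VERSION_UPDATE_MARKER_END-->'

# Demonstrate the marker-based rewrite on a throwaway file.
pom=$(mktemp)
printf '%s\n' "${start}<version>25.02</version>${end}" > "$pom"

java_version=$(normalize_java_version "25.02.00")
# Replace whatever sits between the two markers with the new <version> tag.
sed "s|\(${start}\).*\(${end}\)|\1<version>${java_version}</version>\2|" "$pom" > "$pom.new"
cat "$pom.new"
rm -f "$pom" "$pom.new"
```

Keeping the major and minor fields untouched preserves the zero-padded month (`02`), so only the patch component differs from the tag passed to `update-version.sh`.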

done
2 changes: 1 addition & 1 deletion cpp/cmake/thirdparty/get_dlpack.cmake
Contributor

Please revert any files with only copyright changes.

@@ -1,5 +1,5 @@
# =============================================================================
-# Copyright (c) 2024, NVIDIA CORPORATION.
+# Copyright (c) 2025, NVIDIA CORPORATION.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
# in compliance with the License. You may obtain a copy of the License at
2 changes: 1 addition & 1 deletion cpp/include/cuvs/neighbors/hnsw.h
@@ -47,7 +47,7 @@ enum cuvsHnswHierarchy {

struct cuvsHnswIndexParams {
/* hierarchy of the hnsw index */
-cuvsHnswHierarchy hierarchy;
+enum cuvsHnswHierarchy hierarchy;
/** Size of the candidate list during hierarchy construction when hierarchy is `CPU`*/
int ef_construction;
/** Number of host threads to use to construct hierarchy when hierarchy is `CPU`
14 changes: 14 additions & 0 deletions java/README.md
@@ -0,0 +1,14 @@
Prerequisites
-------------

* JDK 22
* Maven 3.9.6 or later

To build this API, please do `./build.sh java` in the top level directory. Since this API is dependent on `libcuvs` it must be noted that `libcuvs` gets built automatically before building this API.

Alternatively, please build libcuvs (`./build.sh libcuvs` from top level directory) before building the Java API with `./build.sh` from this directory.
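
A quick way to check the prerequisites listed above before building is sketched below. It is not part of the PR: the helper names are hypothetical, "JDK 22" is treated as a minimum (an assumption; the README does not say whether later JDKs are supported), and the parsing assumes the usual first-line formats of `java -version` and `mvn -v`. The example runs on sample strings so that it works without either tool installed.

```shell
#!/bin/sh
# Extract the major version from the first line of `java -version` output,
# e.g. 'openjdk version "22.0.1" 2024-04-16' -> 22.
java_major_version() {
  printf '%s\n' "$1" | sed -n '1s/^[^"]*"\([0-9][0-9]*\).*/\1/p'
}

# Extract the version from the first line of `mvn -v` output,
# e.g. 'Apache Maven 3.9.6 (...)' -> 3.9.6.
maven_version() {
  printf '%s\n' "$1" | sed -n '1s/^Apache Maven \([0-9][0-9.]*\).*/\1/p'
}

# Succeed if dotted version $1 is >= dotted version $2.
# Both must have exactly three numeric fields.
version_at_least() {
  old_ifs=$IFS
  IFS=.
  set -- $1 $2             # a.b.c x.y.z -> $1..$6
  IFS=$old_ifs
  [ "$1" -ne "$4" ] && { [ "$1" -gt "$4" ]; return; }
  [ "$2" -ne "$5" ] && { [ "$2" -gt "$5" ]; return; }
  [ "$3" -ge "$6" ]
}

# Sample inputs; in practice pass "$(java -version 2>&1)" and "$(mvn -v)".
jdk=$(java_major_version 'openjdk version "22.0.1" 2024-04-16')
mvn_ver=$(maven_version 'Apache Maven 3.9.6 (sample)')
[ "$jdk" -ge 22 ] && echo "JDK $jdk: ok"
version_at_least "$mvn_ver" 3.9.6 && echo "Maven $mvn_ver: ok"
```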

Building
--------

`./build.sh` will generate the libcuvs_java.so file in internal/ directory, and then build the final jar file for the cuVS Java API in cuvs-java/ directory.
Contributor

Suggested change
`./build.sh` will generate the libcuvs_java.so file in internal/ directory, and then build the final jar file for the cuVS Java API in cuvs-java/ directory.
`./build.sh` will generate the `libcuvs_java.so` file in the `internal/` directory, and then build the final jar file for the cuVS Java API in the `cuvs-java/` directory.

14 changes: 14 additions & 0 deletions java/build.sh
@@ -0,0 +1,14 @@
VERSION="25.02" # Note: The version is updated automatically when ci/release/update-version.sh is invoked
bdice marked this conversation as resolved.
GROUP_ID="com.nvidia.cuvs"
SO_FILE_PATH="./internal"

if [ -z "$CMAKE_PREFIX_PATH" ]; then
export CMAKE_PREFIX_PATH=`pwd`/../cpp/build
fi

cd internal && cmake . && cmake --build . \
&& cd .. \
&& mvn install:install-file -DgroupId=$GROUP_ID -DartifactId=cuvs-java-internal -Dversion=$VERSION -Dpackaging=so -Dfile=$SO_FILE_PATH/libcuvs_java.so \
&& cd cuvs-java \
&& mvn package \
&& mvn install:install-file -Dfile=./target/cuvs-java-$VERSION-jar-with-dependencies.jar -DgroupId=$GROUP_ID -DartifactId=cuvs-java -Dversion=$VERSION -Dpackaging=jar
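
The `CMAKE_PREFIX_PATH` fallback in the script above can also be expressed with shell parameter expansion. The sketch below is an alternative formulation for illustration only: `resolve_cmake_prefix_path` is a hypothetical helper, and it takes the script directory as an argument instead of relying on `pwd`, which is a design choice of this example rather than something the PR does.

```shell
#!/bin/sh
# Return the CMake prefix path to use: the caller's CMAKE_PREFIX_PATH if it
# is set and non-empty, otherwise <script_dir>/../cpp/build.
resolve_cmake_prefix_path() {
  printf '%s\n' "${CMAKE_PREFIX_PATH:-$1/../cpp/build}"
}

unset CMAKE_PREFIX_PATH
resolve_cmake_prefix_path /repo/java   # prints /repo/java/../cpp/build

CMAKE_PREFIX_PATH=/opt/cuvs
resolve_cmake_prefix_path /repo/java   # prints /opt/cuvs
unset CMAKE_PREFIX_PATH
```

Passing the directory explicitly keeps the default independent of where the script is invoked from; a caller could supply `$(cd "$(dirname "$0")" && pwd)` to get the script's own location.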
1 change: 1 addition & 0 deletions java/cuvs-java/.gitignore
@@ -0,0 +1 @@
/target/
159 changes: 159 additions & 0 deletions java/cuvs-java/pom.xml
@@ -0,0 +1,159 @@
<!--
/*
* Copyright (c) 2025, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-->
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.nvidia.cuvs</groupId>
<artifactId>cuvs-java</artifactId>
<!-- NOTE: The version automatically gets updated when ci/release/update-version.sh is invoked. -->
<!--CUVS_JAVA#VERSION_UPDATE_MARKER_START--><version>25.02</version><!--CUVS_JAVA#VERSION_UPDATE_MARKER_END-->
bdice marked this conversation as resolved.
<name>cuvs-java</name>
<packaging>jar</packaging>

<properties>
<maven.compiler.target>22</maven.compiler.target>
<maven.compiler.source>22</maven.compiler.source>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
</properties>

<dependencies>

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.36</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<version>1.7.36</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>com.carrotsearch.randomizedtesting</groupId>
<artifactId>randomizedtesting-runner</artifactId>
<version>2.8.2</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.13.1</version>
<scope>test</scope>
</dependency>

</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>3.5.2</version>
<configuration>
<parallel>suites</parallel>
<threadCountSuites>1</threadCountSuites>
<perCoreThreadCount>false</perCoreThreadCount>
<systemPropertyVariables>
<java.library.path>${project.build.directory}/classes</java.library.path>
</systemPropertyVariables>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<version>2.10</version>
<executions>
<execution>
<id>copy</id>
<phase>compile</phase>
<goals>
<goal>copy</goal>
</goals>
<configuration>
<artifactItems>
<artifactItem>
<groupId>com.nvidia.cuvs</groupId>
<artifactId>cuvs-java-internal</artifactId>
<!-- NOTE: The version automatically gets updated when ci/release/update-version.sh is invoked. -->
<!--CUVS_JAVA#VERSION_UPDATE_MARKER_START--><version>25.02</version><!--CUVS_JAVA#VERSION_UPDATE_MARKER_END-->
<type>so</type>
<overWrite>false</overWrite>
<outputDirectory>
${project.build.directory}/classes</outputDirectory>
<destFileName>libcuvs_java.so</destFileName>
</artifactItem>
</artifactItems>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.4.2</version>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<archiverConfig>
<duplicateBehavior>add</duplicateBehavior>
</archiverConfig>
</configuration>
<executions>
<execution>
<id>assemble-all</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>2.2</version>
<configuration>
<archive>
<manifest>
<addClasspath>true</addClasspath>
<mainClass>
com.nvidia.cuvs.examples.CagraExample</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<version>3.6.2</version>
<configuration>
<excludePackageNames>com.nvidia.cuvs.examples,com.nvidia.cuvs.panama</excludePackageNames>
<reportOutputDirectory>${project.build.directory}</reportOutputDirectory>
</configuration>
</plugin>
</plugins>
</build>
</project>