API,MAINT: RAPIDS 24.12 and align DatetimeComponent with pylibcudf (#15)
Legate bumped to using rapids-cmake 24.12, so let's use RAPIDS 24.12
also.

Most changes are just search/replace, but:
* This enables CI for Python 3.12. I forgot about that when we went to
24.10 (which I think was already compatible).
* `libcudf` doesn't ship `libcudftestutil` anymore; instead we need to
add it to our link targets (and the compile flags) to compile it here.
* **API change:** `(py)libcudf` now has `extract_datetime_component` and
`DatetimeComponent`, so align with that. This means:
  * Python code needs to use the capitalized names, e.g. `DatetimeComponent.DAY`,
    now.
  * `day_of_year` is a dedicated function in `(py)libcudf`, so it is gone
    for now.
    * (On the C++ side, the same change would be needed.)
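
For callers that still hold the old lowercase member names, the rename could be bridged with a small helper. This is only a sketch under stated assumptions: the `DatetimeComponent` class below is a hypothetical stand-in for the enum from `pylibcudf`, only `DAY` is confirmed by this commit message, and `component_from_legacy_name` is an illustrative name that is not part of this repository.

```python
from enum import Enum, auto


class DatetimeComponent(Enum):
    """Hypothetical stand-in for the enum re-exported from pylibcudf.

    Only `DAY` is named in the commit message; the other members and all
    numeric values here are placeholders for illustration.
    """

    YEAR = auto()
    MONTH = auto()
    DAY = auto()
    HOUR = auto()


def component_from_legacy_name(name: str) -> DatetimeComponent:
    """Map an old lowercase member name to the capitalized member."""
    if name == "day_of_year":
        # Per the commit message, this is no longer an enum member.
        raise ValueError("day_of_year is no longer a DatetimeComponent member")
    try:
        return DatetimeComponent[name.upper()]
    except KeyError:
        raise ValueError(f"unknown datetime component: {name!r}") from None
```

With this stand-in, `component_from_legacy_name("day")` returns `DatetimeComponent.DAY`, while `"day_of_year"` raises a `ValueError` instead of silently mapping to something else.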

---------

Signed-off-by: Sebastian Berg <[email protected]>
seberg authored Jan 29, 2025
1 parent 59b04a5 commit a870b4f
Showing 17 changed files with 96 additions and 158 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/conda-python-build.yaml
@@ -41,7 +41,7 @@ jobs:
PY_VER:
- "3.10"
- "3.11"
# - "3.12" requires update to RAPIDS 24.10
- "3.12"
runs-on: linux-${{ matrix.ARCH }}-cpu4
container:
image: "rapidsai/ci-conda:cuda${{ matrix.CUDA_VER }}-ubuntu22.04-py${{ matrix.PY_VER }}"
5 changes: 4 additions & 1 deletion .github/workflows/pr.yaml
@@ -48,14 +48,17 @@ jobs:
#
# * architectures: amd64 only
# * CUDA: >=12.2
# * Python: 3.10, 3.11, 3.12
# * Python: 3.10, 3.11, 3.12 (3.11 used in doc building)
#
# Valid set of RAPIDS ci-conda image tags: https://hub.docker.com/r/rapidsai/ci-conda/tags
matrix:
include:
- ARCH: "amd64"
CUDA_VER: "12.5.1"
PY_VER: "3.10"
- ARCH: "amd64"
CUDA_VER: "12.5.1"
PY_VER: "3.12"
runs-on: linux-${{ matrix.ARCH }}-gpu-v100-latest-1
container:
image: "rapidsai/ci-conda:cuda${{ matrix.CUDA_VER }}-ubuntu22.04-py${{ matrix.PY_VER }}"
11 changes: 3 additions & 8 deletions ci/run_ctests.sh
@@ -11,13 +11,8 @@ else
cd "${INSTALL_PREFIX:-${CONDA_PREFIX:-/usr}}/bin/gtests/liblegate_dataframe/"
fi

# Unless otherwise specified, use all available GPUs and set
# Unless `LEGATE_CONFIG` is set, default to all available GPUs and set fbmem/sysmem.
# LEGATE_TEST=1 to test broadcasting code paths (locally).
# TODO: Set LEGATE_CONFIG instead (if undefined). However,
# as of 2024-10-11 LEGATE_CONFIG seems broken:
# https://github.com/nv-legate/legate.core.internal/issues/1304
LEGATE_CONFIG=${LEGATE_CONFIG:- --gpus="$(nvidia-smi -L | wc -l) --fbmem=4000 --sysmem=4000"} \
LEGATE_TEST=${LEGATE_TEST:-1} \
legate \
--gpus "$(nvidia-smi -L | wc -l)" \
--fbmem=4000 --sysmem=4000 \
./cpp_tests --output-on-failure --no-tests=error "$@"
legate ./cpp_tests --output-on-failure --no-tests=error "$@"
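
The `${LEGATE_CONFIG:-...}` default used in this script can be sketched in Python as follows. This is a minimal illustration, not code from the repository: `default_legate_config` is a hypothetical helper, and the GPU count is passed in rather than read from `nvidia-smi` so the sketch stays self-contained.

```python
import os


def default_legate_config(num_gpus: int, env=None) -> str:
    """Return LEGATE_CONFIG if set and non-empty, else the CI default.

    Mirrors the shell `${VAR:-default}` form, which also falls back to the
    default when the variable is set but empty.
    """
    env = os.environ if env is None else env
    current = env.get("LEGATE_CONFIG", "")
    if current:
        return current
    # Same flags as the default in the script above.
    return f"--gpus={num_gpus} --fbmem=4000 --sysmem=4000"
```

A user-provided `LEGATE_CONFIG` therefore wins unchanged, and the memory settings only apply when nothing was configured.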
8 changes: 2 additions & 6 deletions ci/run_pytests.sh
@@ -18,15 +18,11 @@ set -e -E -u -o pipefail
# Support invoking run_cudf_pytests.sh outside the script directory
cd "$(dirname "$(realpath "${BASH_SOURCE[0]}")")"/../python/tests/

# Unless otherwise specified, use all available GPUs and set
# Unless `LEGATE_CONFIG` is set, default to all available GPUs and set fbmem/sysmem.
# LEGATE_TEST=1 to test broadcasting code paths (locally).
# TODO: Set LEGATE_CONFIG instead (if undefined). However,
# as of 2024-10-11 LEGATE_CONFIG seems broken:
# https://github.com/nv-legate/legate.core.internal/issues/1304
LEGATE_CONFIG=${LEGATE_CONFIG:- --gpus="$(nvidia-smi -L | wc -l) --fbmem=4000 --sysmem=4000"} \
LEGATE_TEST=${LEGATE_TEST:-1} \
legate \
--gpus "$(nvidia-smi -L | wc -l)"\
--fbmem=4000 \
--module pytest \
. \
-sv \
12 changes: 6 additions & 6 deletions conda/environments/all_cuda-124_arch-x86_64.yaml
@@ -17,24 +17,24 @@ dependencies:
- cuda-profiler-api
- cuda-sanitizer-api
- cuda-version=12.4
- cudf==24.10.*,>=0.0.0a0
- cudf==24.12.*,>=0.0.0a0
- cupy>=12.0.0
- cupynumeric==25.01.*,>=0.0.0.dev0
- cxx-compiler
- cython>=3.0.3
- dask-cuda==24.10.*
- dask-cudf==24.10.*
- dask-cuda==24.12.*
- dask-cudf==24.12.*
- gcc_linux-64=11.*
- legate==25.01.*,>=0.0.0.dev0
- libcudf==24.10.*,>=0.0.0a0
- librmm==24.10.*,>=0.0.0a0
- libcudf==24.12.*,>=0.0.0a0
- librmm==24.12.*,>=0.0.0a0
- make
- myst-parser>=4.0
- ninja
- numpy >=1.23,<3.0.0a0
- openssh
- pydata-sphinx-theme>=0.16.0
- pylibcudf==24.10.*,>=0.0.0a0
- pylibcudf==24.12.*,>=0.0.0a0
- pytest>=7.0
- python>=3.10,<3.13
- rapids-build-backend>=0.3.2,<0.4.0.dev0
7 changes: 6 additions & 1 deletion conda/recipes/legate-dataframe/conda_build_config.yaml
@@ -22,5 +22,10 @@ cuda11_compiler:
legate_version:
- "=25.01.*,>=0.0.0.dev0"

# TODO: The != temporary blocklist cupynumeric versions
# using cupynumeric, because it is fewer versions to block.
cupynumeric_version:
- "=25.01.*,>=0.0.0.dev0,!=25.01.0.dev62,!=25.01.0.dev61,!=25.01.0.dev60,!=25.01.0.rc1"

rapids_version:
- =24.10.*
- =24.12.*
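
The effect of the `!=` blocklist in `cupynumeric_version` can be approximated as below. This is a deliberately simplified sketch and not how conda evaluates match specs: it only models the `=25.01.*` prefix and the four excluded builds, ignores the `>=0.0.0.dev0` clause, and `cupynumeric_allowed` is a hypothetical name.

```python
# Builds excluded by the `!=` clauses in conda_build_config.yaml.
BLOCKED_CUPYNUMERIC = frozenset(
    {"25.01.0.dev62", "25.01.0.dev61", "25.01.0.dev60", "25.01.0.rc1"}
)


def cupynumeric_allowed(version: str) -> bool:
    """Approximate the pin: a 25.01.* version that is not on the blocklist."""
    return version.startswith("25.01.") and version not in BLOCKED_CUPYNUMERIC
```

Under this approximation a made-up later build such as `25.01.0.dev63` passes, while each listed build and any non-25.01 release is rejected.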
4 changes: 2 additions & 2 deletions conda/recipes/legate-dataframe/meta.yaml
@@ -1,4 +1,4 @@
# Copyright (c) 2024, NVIDIA CORPORATION.
# Copyright (c) 2024-2025, NVIDIA CORPORATION.

{% set pyproject_data = load_file_data("python/pyproject.toml") %}
{% set version = environ['LEGATEDATAFRAME_PACKAGE_VERSION'] %}
@@ -60,7 +60,7 @@ requirements:
# Only to ensure a nightly legate version we pick up
# is compatible with an existing cupynumeric version.
# (may also stabilize not using debug/sanitizer builds)
- cupynumeric
- cupynumeric {{ cupynumeric_version }}
- python
- pip
- cython >=3.0.3
2 changes: 1 addition & 1 deletion cpp/cmake/fetch_rapids.cmake
@@ -12,7 +12,7 @@
# the License.
# =============================================================================
if(NOT EXISTS ${CMAKE_CURRENT_BINARY_DIR}/LEGATE_DATAFRAME_RAPIDS.cmake)
file(DOWNLOAD https://raw.githubusercontent.com/rapidsai/rapids-cmake/branch-24.10/RAPIDS.cmake
file(DOWNLOAD https://raw.githubusercontent.com/rapidsai/rapids-cmake/branch-24.12/RAPIDS.cmake
${CMAKE_CURRENT_BINARY_DIR}/LEGATE_DATAFRAME_RAPIDS.cmake
)
endif()
2 changes: 1 addition & 1 deletion cpp/cmake/thirdparty/get_cudf.cmake
@@ -51,5 +51,5 @@ function(find_and_configure_cudf)
endfunction()

find_and_configure_cudf(
VERSION 24.10 GIT_REPO https://github.com/rapidsai/cudf.git GIT_TAG branch-24.10
VERSION 24.12 GIT_REPO https://github.com/rapidsai/cudf.git GIT_TAG branch-24.12
)
20 changes: 4 additions & 16 deletions cpp/include/legate_dataframe/timestamps.hpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2023-2024, NVIDIA CORPORATION.
* Copyright (c) 2023-2025, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -18,6 +18,7 @@

#include <string>

#include <cudf/datetime.hpp>
#include <cudf/types.hpp>

#include <legate_dataframe/core/column.hpp>
@@ -79,27 +80,14 @@ LogicalColumn to_timestamps(const LogicalColumn& input,
cudf::data_type timestamp_type,
std::string format);

enum class DatetimeComponent : int32_t {
year,
month,
day,
weekday,
hour,
minute,
second,
millisecond_fraction,
microsecond_fraction,
nanosecond_fraction,
day_of_year
};

/**
* @brief Extracts part of a timestamp as a int16.
*
* @param input Timestamp column
* @param component The component which to extract.
* @return New int16 column.
*/
LogicalColumn extract_timestamp_component(const LogicalColumn& input, DatetimeComponent component);
LogicalColumn extract_timestamp_component(const LogicalColumn& input,
cudf::datetime::datetime_component component);

} // namespace legate::dataframe
63 changes: 11 additions & 52 deletions cpp/src/timestamps.cpp
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2023-2024, NVIDIA CORPORATION.
* Copyright (c) 2023-2025, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -16,10 +16,7 @@

#include <string>

// cudf's detail API of datetime assume that RMM has been included
#include <rmm/mr/device/device_memory_resource.hpp>

#include <cudf/detail/datetime.hpp>
#include <cudf/datetime.hpp>
#include <cudf/strings/convert/convert_datetime.hpp>

#include <legate_dataframe/core/column.hpp>
@@ -52,52 +49,13 @@ class ExtractTimestampComponentTask
static void gpu_variant(legate::TaskContext context)
{
GPUTaskContext ctx{context};
const auto component =
argument::get_next_scalar<std::underlying_type_t<DatetimeComponent>>(ctx);
const auto input = argument::get_next_input<PhysicalColumn>(ctx);
auto output = argument::get_next_output<PhysicalColumn>(ctx);
const auto component = argument::get_next_scalar<cudf::datetime::datetime_component>(ctx);
const auto input = argument::get_next_input<PhysicalColumn>(ctx);
auto output = argument::get_next_output<PhysicalColumn>(ctx);

std::unique_ptr<cudf::column> ret;
/* unfortunately, there seems to be no templating for this in libcudf: */
switch (static_cast<DatetimeComponent>(component)) {
case DatetimeComponent::year:
ret = cudf::datetime::detail::extract_year(input.column_view(), ctx.stream(), ctx.mr());
break;
case DatetimeComponent::month:
ret = cudf::datetime::detail::extract_month(input.column_view(), ctx.stream(), ctx.mr());
break;
case DatetimeComponent::day:
ret = cudf::datetime::detail::extract_day(input.column_view(), ctx.stream(), ctx.mr());
break;
case DatetimeComponent::weekday:
ret = cudf::datetime::detail::extract_weekday(input.column_view(), ctx.stream(), ctx.mr());
break;
case DatetimeComponent::hour:
ret = cudf::datetime::detail::extract_hour(input.column_view(), ctx.stream(), ctx.mr());
break;
case DatetimeComponent::minute:
ret = cudf::datetime::detail::extract_minute(input.column_view(), ctx.stream(), ctx.mr());
break;
case DatetimeComponent::second:
ret = cudf::datetime::detail::extract_second(input.column_view(), ctx.stream(), ctx.mr());
break;
case DatetimeComponent::millisecond_fraction:
ret = cudf::datetime::detail::extract_millisecond_fraction(
input.column_view(), ctx.stream(), ctx.mr());
break;
case DatetimeComponent::microsecond_fraction:
ret = cudf::datetime::detail::extract_microsecond_fraction(
input.column_view(), ctx.stream(), ctx.mr());
break;
case DatetimeComponent::nanosecond_fraction:
ret = cudf::datetime::detail::extract_nanosecond_fraction(
input.column_view(), ctx.stream(), ctx.mr());
break;
case DatetimeComponent::day_of_year:
ret = cudf::datetime::detail::day_of_year(input.column_view(), ctx.stream(), ctx.mr());
break;
default: throw std::runtime_error("invalid resolution to time part extraction?");
}
ret = cudf::datetime::extract_datetime_component(
input.column_view(), component, ctx.stream(), ctx.mr());

output.move_into(std::move(ret));
}
@@ -119,7 +77,8 @@ LogicalColumn to_timestamps(const LogicalColumn& input,
return ret;
}

LogicalColumn extract_timestamp_component(const LogicalColumn& input, DatetimeComponent component)
LogicalColumn extract_timestamp_component(const LogicalColumn& input,
cudf::datetime::datetime_component component)
{
if (!cudf::is_timestamp(input.cudf_type())) {
throw std::invalid_argument("extract_timestamp_component() input must be timestamp");
@@ -128,8 +87,8 @@ LogicalColumn extract_timestamp_component(const LogicalColumn& input, DatetimeCo
auto ret = LogicalColumn::empty_like(cudf::data_type{cudf::type_id::INT16}, input.nullable());
legate::AutoTask task =
runtime->create_task(get_library(), task::ExtractTimestampComponentTask::TASK_ID);
argument::add_next_scalar(task,
static_cast<std::underlying_type_t<DatetimeComponent>>(component));
argument::add_next_scalar(
task, static_cast<std::underlying_type_t<cudf::datetime::datetime_component>>(component));
argument::add_next_input(task, input);
argument::add_next_output(task, ret);
runtime->submit(std::move(task));
7 changes: 5 additions & 2 deletions cpp/tests/CMakeLists.txt
@@ -34,10 +34,13 @@ set_target_properties(
CUDA_STANDARD_REQUIRED ON
)

set(LDF_TEST_CUDA_FLAGS --expt-extended-lambda --expt-relaxed-constexpr)
target_compile_options(cpp_tests PRIVATE "$<$<COMPILE_LANGUAGE:CUDA>:${LDF_TEST_CUDA_FLAGS}>")

# Note that fmt::fmt should not be required, but seems to be for debug builds.
target_link_libraries(
cpp_tests PRIVATE LegateDataframe cudf cudf::cudftestutil GTest::gmock GTest::gtest fmt::fmt
$<TARGET_NAME_IF_EXISTS:conda_env>
cpp_tests PRIVATE LegateDataframe cudf cudf::cudftestutil cudf::cudftestutil_impl
GTest::gmock GTest::gtest fmt::fmt $<TARGET_NAME_IF_EXISTS:conda_env>
)
rapids_test_add(
NAME cpp_tests
20 changes: 10 additions & 10 deletions dependencies.yaml
@@ -175,8 +175,8 @@ dependencies:
packages:
- cupynumeric==25.01.*,>=0.0.0.dev0
- pytest>=7.0
- dask-cuda==24.10.*
- dask-cudf==24.10.*
- dask-cuda==24.12.*
- dask-cudf==24.12.*
- output_types: conda
packages:
- cuda-sanitizer-api
@@ -187,17 +187,17 @@ dependencies:
common:
- output_types: conda
packages:
- &cudf_unsuffixed cudf==24.10.*,>=0.0.0a0
- &pylibcudf_unsuffixed pylibcudf==24.10.*,>=0.0.0a0
- &cudf_unsuffixed cudf==24.12.*,>=0.0.0a0
- &pylibcudf_unsuffixed pylibcudf==24.12.*,>=0.0.0a0
specific:
- output_types: [requirements, pyproject]
matrices:
- matrix:
cuda: "12.*"
cuda_suffixed: "true"
packages:
- cudf-cu12==24.10.*,>=0.0.0a0
- pylibcudf-cu12==24.10.*,>=0.0.0a0
- cudf-cu12==24.12.*,>=0.0.0a0
- pylibcudf-cu12==24.12.*,>=0.0.0a0
- {matrix: null, packages: [*cudf_unsuffixed, *pylibcudf_unsuffixed]}

depends_on_cupy:
@@ -223,30 +223,30 @@ dependencies:
common:
- output_types: conda
packages:
- &libcudf_unsuffixed libcudf==24.10.*,>=0.0.0a0
- &libcudf_unsuffixed libcudf==24.12.*,>=0.0.0a0
specific:
- output_types: [requirements, pyproject]
matrices:
- matrix:
cuda: "12.*"
cuda_suffixed: "true"
packages:
- libcudf-cu12==24.10.*,>=0.0.0a0
- libcudf-cu12==24.12.*,>=0.0.0a0
- {matrix: null, packages: [*libcudf_unsuffixed]}

depends_on_librmm:
common:
- output_types: conda
packages:
- &librmm_unsuffixed librmm==24.10.*,>=0.0.0a0
- &librmm_unsuffixed librmm==24.12.*,>=0.0.0a0
specific:
- output_types: [requirements, pyproject]
matrices:
- matrix:
cuda: "12.*"
cuda_suffixed: "true"
packages:
- librmm-cu12==24.10.*,>=0.0.0a0
- librmm-cu12==24.12.*,>=0.0.0a0
- matrix:
packages:
- *librmm_unsuffixed
7 changes: 5 additions & 2 deletions python/legate_dataframe/lib/timestamps.pyi
@@ -2,15 +2,18 @@
# SPDX-License-Identifier: Apache-2.0

from numpy.typing import DTypeLike
from pylibcudf.datetime import DatetimeComponent

from legate_dataframe.lib.core.column import LogicalColumn

__all__ = ["to_timestamps", "extract_timestamp_component", "DatetimeComponent"]

def to_timestamps(
col: LogicalColumn,
timestamp_type: DTypeLike,
format_pattern: str,
) -> LogicalColumn: ...
def extract_timepart(
def extract_timestamp_component(
col: LogicalColumn,
resolution: str,
component: DatetimeComponent,
) -> LogicalColumn: ...