Setting Up CI For Wheel-Based CuPy Environments
In the realm of software development, Continuous Integration (CI) plays a pivotal role in ensuring code quality and reliability. For projects like CuPy, which rely heavily on compiled extensions and CUDA toolkits, setting up a robust CI system is crucial. This article delves into the process of establishing a CI pipeline specifically designed to test fully wheel-based CuPy environments, ensuring that the pathfinder mechanism functions correctly. This is a follow-up to the discussion initiated in issue #9444, addressing the necessity of testing pure-wheel environments where no CUDA Toolkit (CTK) is pre-installed, neither in the virtual machine (VM) nor in the container.
Understanding the Importance of Wheel-Based Testing
Testing in a pure-wheel environment is vital for several reasons. Wheels are the standard distribution format for Python packages, and ensuring that CuPy functions correctly when installed from wheels guarantees a consistent experience for users. This approach eliminates potential issues arising from variations in system configurations or pre-installed libraries. A wheel-based testing strategy ensures that the CuPy package, along with its dependencies, can be installed and utilized seamlessly across different environments. This is particularly important for libraries like CuPy, which often interact with hardware accelerators like GPUs and require specific CUDA toolkit versions. By focusing on wheel-based installations, we can isolate and resolve issues related to the packaging and distribution process itself.
The Basic Recipe for CI Setup
The core of our CI setup involves several key steps, each designed to isolate and test specific aspects of the CuPy installation and functionality. Here’s a detailed breakdown of the recipe:
- Building CuPy Wheels: The initial step involves building CuPy wheels in the standard manner. These wheels encapsulate the compiled code and metadata necessary for installation. This process should mirror the typical release procedure to ensure consistency.
- Uploading Wheels to a Scratch Space: Once the wheels are built, they need to be stored in a temporary location, often referred to as a scratch space. This space acts as a repository for the wheels, making them accessible to subsequent CI stages. This step ensures that the wheels are readily available for installation in the test environments.
- Initiating the Test Stage on a GPU Node: The heart of the testing process lies in executing tests on a GPU-enabled node. This ensures that CuPy's GPU-accelerated functionalities are thoroughly evaluated. The test stage is triggered once the wheels are built and uploaded.
- Downloading Wheels from Scratch: Within the test stage, the first task is to retrieve the CuPy wheels from the scratch space. This step simulates the installation process that a user would typically follow.
- Starting a Vanilla Ubuntu Container with GPU Support: To create a controlled and isolated environment, a vanilla Ubuntu container is initiated. This container provides a clean slate, free from any pre-existing installations or configurations that could interfere with the tests. The --gpus=all flag ensures that the container has access to all available GPUs on the host system.
- Creating a Python Virtual Environment (venv): A Python virtual environment (venv) is created within the container. This isolates the CuPy installation and its dependencies from the system-level Python installation, preventing conflicts and ensuring reproducibility.
- Installing the CuPy Wheel via pip: The CuPy wheel is installed using pip, the Python package installer. The installation command includes the [ctk] extra, which specifies the CUDA Toolkit dependency and ensures that the correct CUDA Toolkit wheels are installed alongside CuPy. For example, the command might look like this: pip install "./cupy_cuda12x-13.6.0-cp313-cp313-manylinux2014_x86_64.whl[ctk]".
- Running pytest to Test CuPy against CTK Wheels: Finally, pytest is used to execute the CuPy test suite. These tests verify the functionality of CuPy against the specified CUDA Toolkit version, ensuring that CuPy operates correctly with the intended CTK wheels.
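The test stage portion of this recipe can be sketched as a single script. This is a minimal sketch, not CuPy's actual CI code: the Ubuntu image tag, wheel filename, mount paths, and the location of the test suite are all illustrative assumptions, and the script skips itself on machines without Docker or an NVIDIA driver.

```shell
#!/usr/bin/env bash
# Sketch of the test stage (download, container, venv, pip install, pytest).
# Image tag, wheel filename, and paths are assumptions, not real CI values.
set -euo pipefail

WHEEL_DIR="$PWD/wheels"   # wheels already downloaded from the scratch space
WHEEL="cupy_cuda12x-13.6.0-cp313-cp313-manylinux2014_x86_64.whl"

if command -v docker >/dev/null 2>&1 && command -v nvidia-smi >/dev/null 2>&1; then
  # Vanilla Ubuntu container, all host GPUs exposed, no CTK preinstalled.
  docker run --rm --gpus=all -v "$WHEEL_DIR:/wheels:ro" ubuntu:24.04 bash -ec '
    apt-get update -qq && apt-get install -yqq python3-venv
    python3 -m venv /opt/venv && . /opt/venv/bin/activate
    # Install pytest plus the CuPy wheel with the [ctk] extra, pulling CTK wheels.
    pip install pytest "/wheels/'"$WHEEL"'[ctk]"
    # Assumes the test suite was staged next to the wheels (not shipped in them).
    python -m pytest -q /wheels/tests
  '
else
  echo "docker and/or an NVIDIA driver are unavailable; skipping the container stage"
fi
```

The guard at the top is only there so the sketch degrades gracefully outside a GPU runner; in the real pipeline the test stage would be scheduled exclusively on GPU nodes.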
Leveraging Existing Examples and Infrastructure
The CUDA Python ecosystem offers valuable precedents for setting up CI pipelines. The example found in NVIDIA/numba-cuda#604 provides a self-contained illustration of a CI setup for wheel-based testing. This example can serve as a blueprint for the CuPy CI pipeline, offering insights into the necessary configurations and scripts. By adapting successful strategies from related projects, we can accelerate the development and deployment of our CI system.
CI Pipeline Configuration
To ensure comprehensive testing, we should aim for a matrix of CI pipelines that covers different CUDA versions and operating systems. A minimum of 2 * 2 = 4 CI pipelines is recommended, encompassing CUDA 12 and 13, each tested on both Linux (x64) and Windows (x64). This matrix ensures that CuPy is thoroughly tested across the most common deployment environments. For the sake of efficiency, the Python version can be fixed initially: the CUDA Toolkit wheels and the pathfinder wheel do not depend on the Python version (the pathfinder wheel is pure Python), so varying Python would add computational cost without meaningfully extending test coverage.
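If the pipeline were expressed in GitHub Actions (an assumption; CuPy's CI may use a different system, and the runner labels and helper scripts below are hypothetical), the minimal 2 * 2 matrix could look like:

```yaml
# Hypothetical GitHub Actions sketch of the minimal CUDA-version x OS matrix.
jobs:
  test-wheels:
    strategy:
      matrix:
        cuda: ["12", "13"]
        os: [linux-x64-gpu, windows-x64-gpu]   # assumed self-hosted GPU runner labels
    runs-on: ${{ matrix.os }}
    steps:
      - name: Download wheels from scratch space
        run: ./ci/download-wheels.sh ${{ matrix.cuda }}   # hypothetical helper
      - name: Run wheel-based tests in a clean container
        run: ./ci/test-wheels.sh ${{ matrix.cuda }}       # hypothetical helper
```

Keeping the Python axis out of the matrix, as discussed above, holds the job count at four.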
Addressing Challenges and Future Considerations
The primary challenge in setting up this CI pipeline lies in the accessibility of GPU runners for the test stage. The existing GPU runners may not be publicly accessible, which necessitates specific configurations and permissions. Overcoming this hurdle is crucial for the successful implementation of the CI system. In the future, expanding the CI matrix to include additional CUDA versions and operating systems can further enhance test coverage. Additionally, integrating automated performance testing into the CI pipeline can provide valuable insights into the efficiency of CuPy's GPU-accelerated operations.
Steps Breakdown for Enhanced Clarity
To further clarify the CI process, let's break down each step with more detail:
1. Building CuPy Wheels
This initial stage is where the CuPy package is compiled and packaged into wheels. Wheels are a standardized format for distributing Python packages, making installation easier and more consistent. The build process involves compiling the CuPy source code, including any necessary CUDA kernels, and packaging them along with metadata into a .whl file. This step is crucial because it creates the artifact that will be installed and tested in subsequent stages. The wheels need to be built for specific Python versions and CUDA toolkit versions to ensure compatibility. For instance, a wheel might be named cupy_cuda12x-13.6.0-cp313-cp313-manylinux2014_x86_64.whl, indicating it's built for CUDA 12.x, CuPy version 13.6.0, Python 3.13, and the manylinux2014 standard for Linux compatibility.
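The tags in such a wheel filename can be pulled apart mechanically, which is handy when CI scripts need to select the right wheel for a given Python or platform. A small bash sketch, using the example filename from above:

```shell
# Split a wheel filename into its name/version/python/abi/platform tags.
# The filename is the illustrative example from the text.
whl="cupy_cuda12x-13.6.0-cp313-cp313-manylinux2014_x86_64.whl"
base="${whl%.whl}"                       # strip the .whl suffix
IFS='-' read -r name version pytag abitag plattag <<< "$base"
echo "$name $version $pytag $abitag $plattag"
# → cupy_cuda12x 13.6.0 cp313 cp313 manylinux2014_x86_64
```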
2. Uploading Wheels to a Scratch Space
Once the wheels are built, they are uploaded to a scratch space. This is a temporary storage location where the wheels can be accessed by other stages in the CI pipeline. The scratch space acts as a central repository, ensuring that the wheels are readily available for installation in the test environments. This step is essential for decoupling the build stage from the test stage, allowing each to run independently. The scratch space might be a cloud storage service, an internal file server, or any other accessible storage medium.
3. Initiating the Test Stage on a GPU Node
The test stage is where the actual testing of CuPy takes place. This stage is initiated on a GPU-enabled node, ensuring that CuPy's GPU-accelerated functionalities are thoroughly evaluated. The GPU node provides the necessary hardware for running CuPy's tests, which heavily rely on GPU acceleration. This stage is triggered after the wheels have been built and uploaded to the scratch space. Importantly, the node needs only the NVIDIA driver: no CUDA Toolkit should be installed on it, since the whole point of this pipeline is to verify that the CTK delivered via wheels, and located by the pathfinder, is sufficient on its own.
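A quick way to characterize the node is to check for the driver utility. This sketch assumes nothing beyond standard tooling; on a non-GPU machine it simply reports that fact:

```shell
# A GPU node for this pipeline needs the NVIDIA driver, and deliberately NOT
# the CUDA Toolkit: the only CTK the tests may find must come from wheels.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
else
  echo "nvidia-smi not found: this is not a GPU node"
fi
```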
4. Downloading Wheels from Scratch
Within the test stage, the first step is to download the CuPy wheels from the scratch space. This simulates the installation process that a user would typically follow. By downloading the wheels, the CI pipeline ensures that the installation process is tested as part of the overall testing strategy. This step verifies that the wheels are accessible and that the download process works correctly. The downloaded wheels are then ready for installation in the test environment.
5. Starting a Vanilla Ubuntu Container with GPU Support
To create a controlled and isolated environment, a vanilla Ubuntu container is started. Containers provide a way to package software in a standardized unit for development, shipment, and deployment. A vanilla Ubuntu container provides a clean slate, free from any pre-existing installations or configurations that could interfere with the tests. The --gpus=all flag ensures that the container has access to all available GPUs on the host system. This is crucial for testing CuPy's GPU-accelerated functionalities. The container isolates the test environment, making the tests more reproducible and reliable.
6. Creating a Python Virtual Environment (venv)
A Python virtual environment (venv) is created within the container. A virtual environment is a self-contained directory that contains a Python installation for a particular version of Python, plus a number of additional packages. This isolates the CuPy installation and its dependencies from the system-level Python installation, preventing conflicts and ensuring reproducibility. The virtual environment ensures that the tests are run in a consistent environment, regardless of the system's Python configuration. This is essential for ensuring that the tests are reliable and that the results are meaningful.
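The venv step is short enough to show in full. A minimal sketch with an illustrative path; note that in the real CI stage you would create the venv with pip included, whereas --without-pip is used here only so the sketch also runs on systems where ensurepip is not bundled with Python:

```shell
# Create an isolated environment and confirm the interpreter points into it.
python3 -m venv --without-pip /tmp/cupy-venv   # real CI: omit --without-pip
. /tmp/cupy-venv/bin/activate
python -c 'import sys; print(sys.prefix)'      # → /tmp/cupy-venv
```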
7. Installing the CuPy Wheel via pip
The CuPy wheel is installed using pip, the Python package installer. pip is a package management system used to install and manage software packages written in Python. The installation command includes the [ctk] extra, which specifies the CUDA Toolkit dependency. This ensures that the correct CUDA Toolkit version is installed alongside CuPy. For example, the command might look like this: pip install "./cupy_cuda12x-13.6.0-cp313-cp313-manylinux2014_x86_64.whl[ctk]". The [ctk] extra tells pip to install the CUDA Toolkit dependencies specified in the wheel's metadata. This step ensures that CuPy is installed correctly and that all its dependencies are satisfied.
8. Running pytest to Test CuPy against CTK Wheels
Finally, pytest is used to execute the CuPy test suite. pytest is a popular Python testing framework that makes it easy to write and run tests. These tests verify the functionality of CuPy against the specified CUDA Toolkit version. This step ensures that CuPy operates correctly with the intended CUDA Toolkit. The test suite includes a variety of tests that cover different aspects of CuPy's functionality, including GPU array operations, CUDA kernel execution, and compatibility with other libraries. Running the tests ensures that CuPy is working as expected and that any issues are detected and addressed.
Conclusion
Setting up a CI pipeline for wheel-based CuPy environments is a crucial step in ensuring the library's quality and reliability. By following the recipe outlined in this article, developers can establish a robust testing system that covers various CUDA versions and operating systems. This approach not only guarantees a consistent user experience but also facilitates the early detection of potential issues.