Skip to content

Fix Windows build compatibility (MSVC cl.exe and clang-cl)#201

Open
lorisercole wants to merge 49 commits into
wavefunction91:masterfrom
lorisercole:fix/msvc-build
Open

Fix Windows build compatibility (MSVC cl.exe and clang-cl)#201
lorisercole wants to merge 49 commits into
wavefunction91:masterfrom
lorisercole:fix/msvc-build

Conversation

@lorisercole

@lorisercole lorisercole commented May 7, 2026

Copy link
Copy Markdown
Contributor

Summary

Enable GauXC to build cleanly with both MSVC cl.exe and clang-cl on Windows,
at /W3 plus mandatory /w1XXXX warning flags, with zero warnings.

Compiler compatibility fixes

  • __PRETTY_FUNCTION____FUNCSIG__ under _MSC_VER && !__clang__ guard
  • __builtin_popcount → C++20 std::popcount (MSVC has no GCC builtins)
  • C99 [restrict] array params → *__restrict pointers in 14 rys files
  • __attribute__((always_inline))FORCE_INLINE macro (__forceinline on MSVC)
  • C99 VLAs → _malloca/_freea in rys_rw.c and rys_xrw.c
  • __attribute__((__aligned__(64)))alignas(64) in 20 integral_*.cxx files
  • __restrict____restrict in integral_1_0.cxx
  • Add missing #include <string> in reduction_driver.hpp
  • Make Molecule::operator== const for C++20 compatibility

MSVC warning fixes (all fixed at source)

  • C4013: add missing #include <stdlib.h> for exit() in gau2grid
  • C4018: signed/unsigned mismatch in gau2grid and atomic_radii.cxx
  • C4100: unused parameters — [[maybe_unused]] or (void) casts in GauXC sources, gau2grid, and rys
  • C4101: unreferenced locals in gau2grid generated code
  • C4244: implicit narrowing — pm_to_bohr return type changed from long double to double; explicit casts at std::floor, std::max, and int→double sites
  • C4267: size_tint narrowing at BLAS call boundaries — introduce int32_t locals; fix 0ulsize_t{0} in std::accumulate (Windows unsigned long is 32-bit)
  • C4242/C4244: ::toupper returns int; use explicit-cast lambda in std::transform
  • C4389: cast uint32_t to int32_t before std::count/std::find on int32_t containers
  • C4701: value-initialize coeff_secondary_arr and EXC/EXC_ref
  • C4530: add /EHsc as PUBLIC on gauxc target

clang-cl warning fixes

  • Narrow __FUNCSIG__ guard to exclude clang-cl (-Wlanguage-extension-token)
  • Use size_t loop indices where compared against size_t bounds (-Wsign-compare)
  • Remove unused cart_array typedef (-Wunused-local-typedef)
  • Add default case to switch statements missing XCWeightAlg::NOTPARTITIONED (-Wswitch)
  • Suppress -Wunused-variable for clang-cl targets (pervasive in conditional-compilation code)
  • Suppress -Wunknown-pragmas for gau2grid (uses MSVC __pragma(loop(ivdep)))

gau2grid code generator updates

  • Update Python generators (c_generator.py, RSH.py, c_util_generator.py) so regenerated C source files are warning-free under MSVC and clang-cl
  • Fix _remove_unused_decls to skip comments when checking variable references
  • Scope #include <stdlib.h> to gau2grid_helper.c only

Exception refactor

  • Replace strdup/_strdup with stored std::string member in all 7 exception classes — fixes portability (no POSIX dependency) and eliminates memory leak in what()

Static HDF5 linking on Windows

  • Patch HighFive libdeps to use H5_BUILT_AS_STATIC_LIB (CMake's FindHDF5 incorrectly sets H5_BUILT_AS_DYNAMIC_LIB)
  • Link transitive static dependencies (zlib, szip/aec, shlwapi)

Dependency updates

  • Update ExchCXX and IntegratorXX hashes to include their MSVC fix branches

Requires:

GauXC was only buildable with GCC, Clang, and clang-cl. This commit fixes all
compilation and linking issues when building with MSVC's native `cl.exe` compiler.

## Compiler compatibility fixes
- `__PRETTY_FUNCTION__` → `__FUNCSIG__` (`exceptions.hpp`):
  MSVC does not support `__PRETTY_FUNCTION__`; use `__FUNCSIG__` under `_MSC_VER` guard.
- `__builtin_popcount` → `std::popcount` (`exx_screening.cxx`):
  MSVC does not provide GCC builtins; use C++20 `<bit>` header instead.
- C99 `[restrict]` array params → `*__restrict` pointers (14 rys files):
  MSVC does not support the C99 array parameter syntax. The `*__restrict` pointer
  form is portable across GCC, Clang, and MSVC.
- `__attribute__((always_inline))` → `FORCE_INLINE` macro (`rys_integral.c`):
  Maps to `__forceinline` on MSVC and `__attribute__((always_inline))` on GCC/Clang.
  Also replaces the GNU statement-expression `MIN` macro with a simple ternary.
- C99 VLAs → `_malloca`/`_freea` (`rys_rw.c`, `rys_xrw.c`):
  MSVC does not support VLAs. Uses `_malloca`/`_freea` (stack with heap fallback)
  under `_MSC_VER` guards.
- `__attribute__((__aligned__(64)))` → `alignas(64)` (20 `integral_*.cxx` files):
  Portable C++11 alignment specifier, works on GCC, Clang, and MSVC.
- `__restrict__` → `__restrict` (`integral_1_0.cxx`):
  `__restrict__` is GCC/Clang-only; `__restrict` is portable across all three compilers.
- Missing `#include <string>` (`reduction_driver.hpp`):
  MSVC does not provide `<string>` transitively through other headers.
- Non-const `operator==` (`molecule.hpp`):
  Made `Molecule::operator==` `const` to fix C++20 ambiguity with synthesized
  reverse candidates on MSVC.

## Static HDF5 linking fixes (`src/external/CMakeLists.txt`)

- HighFive incorrectly propagates `H5_BUILT_AS_DYNAMIC_LIB`:
  CMake's `FindHDF5` module sets this define on Windows when `HDF5_USE_STATIC_LIBRARIES`
  is unset, regardless of whether the library is actually static. HighFive then
  propagates it via its `libdeps` INTERFACE target. The fix patches `libdeps` after
  `FetchContent` to replace `H5_BUILT_AS_DYNAMIC_LIB` with `H5_BUILT_AS_STATIC_LIB`.
- HDF5 transitive static dependencies:
  When linking HDF5 statically on Windows, its dependencies (zlib, szip/aec, shlwapi)
  must be linked explicitly as they are not auto-resolved.
warning C4996: 'strdup': The POSIX name for this item is deprecated. Instead, use the ISO C and C++ conformant name: _strdup.
- Add /EHsc PUBLIC to gauxc target so all consumers (test executables)
  inherit it and avoid C4530 from MSVC STL headers
- Add /EHsc PRIVATE to Catch2WithMain (compiled FetchContent target that
  does not inherit from gauxc)
- Remove /W2 (overrode /W3 from build script), /wd4100 and /wd4242
  (suppressed warnings enabled by mandatory /w14100 and /w14242 flags)
- Keep /wd4101 and /wd5219 (not in mandatory warning list)
Build the error message once in the constructor and store it in a
std::string what_msg_ member. what() returns what_msg_.c_str(),
avoiding the POSIX-deprecated strdup/non-portable _strdup entirely.

Applies to all 7 exception classes: generic_gauxc_exception,
cuda_exception, cublas_exception, cutlass_exception, hip_exception,
hipblas_exception, magma_exception.
…H_BUILD_STATIC_LIBRARY=ON

Catch2 v2 creates Catch2WithMain only when CATCH_BUILD_STATIC_LIBRARY is
ON (off by default). gauxc_test inherits /EHsc transitively from gauxc
PUBLIC /EHsc, so no explicit override is needed.
Add [[maybe_unused]] to parameters that are unused in non-OpenMP or
non-gencon code paths (e.g., ReductionOp, ks_settings, ldtps, delim).
- atomic_radii.cxx: change pm_to_bohr to return double (was long double),
  eliminating implicit long double->double narrowing at all call sites
- config_obara_saika.hpp: explicit cast of std::floor result to int
- shell_batched_xc_integrator.cxx: explicit cast of std::max result to uint32_t
- standalone_driver.cxx: explicit cast of int N_EL_ref to double
- atomic_radii.cxx: change pm_to_bohr to return double (was long double);
  add static_cast<size_t> to fix signed/unsigned comparison with vector::size()
- tests/grid_test.cxx: cast int batch_sz to size_t in CHECK comparison
Add explicit casts where size_t values are passed to int-typed parameters:
- gau2grid_collocation.cxx: cast npts/ncomp to unsigned long at gau2grid
  API boundary; keep size_t for pointer arithmetic to avoid 32-bit overflow
  on Windows (where unsigned long is 32-bit)
- reference_local_host_work_driver.cxx: introduce int32_t locals for all
  size_t values passed to BLAS functions (blas::gemm, blas::dot, etc.)
- Integrator headers: static_cast<int32_t> for task sizes and shell counts
  passed to BLAS and submat map functions
C4389 (signed/unsigned ==) and C4242 (int->char) fire inside MSVC's own
xutility/algorithm headers when compiled with /W3 — not in our source.
Suppress both with /wd4389 /wd4242 (PRIVATE, not propagated to consumers).

Also add build*/ to .gitignore to cover build-msvc/ and build-clang-cl/.
- xc_task.hpp: cast points.size() to int32_t in both merge_with overloads
- basisset.hpp: cast size() to int32_t in nshells() return
- misc.hpp: cast std::min(A.size(), B.size()) to uint32_t
- integrate_den.hpp: cast task sizes to int32_t
- reference_local_host_work_driver.cxx: add inbf locals and cast npts/nbe_cart
  at sph_trans/submat/inc_by_submat_atomic call boundaries; fix ioff = i*nbf
  (size_t) -> i*inbf (int32_t*int32_t)
- real_solid_harmonics.hpp: cast intmax_t results of integral_*_factorial
  and binomial_coefficient to double at assignment sites
- basisset_map.hpp: cast ptrdiff_t result of std::distance to int32_t
- reference_replicated_xc_host_integrator_dd_psi.hpp: cast int64_t ldPsi
  to int at blas::gemm call sites
- parse_basis.cxx: value-initialize coeff_secondary_arr{} so MSVC can
  prove it is always initialized before use in the gencon branch
- standalone_driver.cxx: initialize EXC and EXC_ref to 0.0 at declaration
All remaining C4267, C4244, C4242, C4389 warnings fire exclusively inside
MSVC standard library headers (xutility, utility, numeric, algorithm) when
GauXC templates are instantiated. They are not from GauXC source code.
Suppress them privately so diagnostics from our own code remain visible.
…t32_t container

basisset_map.hpp nshells_with_l and l_purity pass a uint32_t l to
std::count/std::find over a vector<int32_t>, triggering a signed/unsigned
comparison inside MSVC xutility. Cast l to int32_t at the call site.
::toupper takes int and returns int; storing the result into a char iterator
causes C4242 (int->char) and C4244 in MSVC. Use a lambda that casts the
output explicitly, matching the pattern required by the standard.
- basisset_map.hpp: cast size_t st_idx/range_end to int32_t when inserting
  into vector<int32_t> and vector<pair<int32_t,int32_t>>
- xc_task.hpp: use size_t{0} init in std::accumulate (was 0ul, which is
  32-bit on Windows)
- load_balancer_impl.cxx: same 0ul -> size_t{0} fix in total_npts()
- molmeta.cxx: same 0ul -> size_t{0} fix in sum_atomic_charges accumulate
- exx_screening.cxx: cast shell index loop vars to int32_t when pushing
  into vector<pair<int32_t,int32_t>> and vector<int32_t>
All previously suppressed warnings (C4242, C4244, C4267, C4389) are now
fixed at their source. Remove the suppressions so MSVC continues to catch
similar regressions.
Five more std::transform(... ::toupper) call sites not covered by the
previous fix. The same pattern applies: ::toupper returns int; storing
that into a char iterator is C4242. Use a lambda that casts explicitly.
- xc_integrator.cxx: 0ul -> size_t{0} in std::accumulate init (0ul is
  32-bit on Windows, causing C4244 when lambda returns int64_t)
- ini_input.cxx: add static_cast<char> to std::toupper return in lambdas
  (missing cast caused C4242 int->char)
clang-cl defines both _MSC_VER and __clang__; __FUNCSIG__ is a
MSVC-only extension that clang-cl warns about. Narrow the guard so
only MSVC (not clang-cl) uses it.
Add -Wno-unused-variable to gauxc, gauxc_test, and standalone_driver
under clang-cl (MSVC + Clang) to suppress unused variable warnings
without adding [[maybe_unused]] annotations throughout the codebase.
…per.c

The MSVC-only preprocessor branch included <malloc.h> but not <stdlib.h>,
leaving exit() undeclared. The clang-cl branch already had both.
Cast int nprim to (unsigned long) in loop comparisons against unsigned
long iterators. Change unsigned int start_shift to unsigned long to
match the unsigned long operands in its computation.
Remove unused double A, AX/AY/AZ, AXX/.../AZZ, and AXXX/.../ZZZ
declarations from collocation functions where the angular momentum
level is too low to need them.
… and rys

Add (void)param casts for parameters required by the function signature
but unused in the implementation: ncart/nspherical/ncart_out in
gau2grid_transform.c, llA in rys compute_vrr3, ntgqp in rys_xrw.
Update c_generator.py (C4013, C4018, C4101), RSH.py and
c_util_generator.py (C4100) so regenerated sources match the
hand-fixed generated files.
- Remove trailing semicolons from (void) casts (cg.write appends them)
- Skip comments when checking if declared variables are used
- Only add #include <stdlib.h> to helper.c (for exit()), not all files
- C4101: remove unused out_shift variable from cart_sum functions
- C4100: silence unused trans parameter in block_copy
The clang-cl branch uses MSVC's __pragma(loop(ivdep)) which clang-cl
does not recognize. Suppress -Wunknown-pragmas for Clang builds rather
than changing the pragma, keeping the same hint used by MSVC.
@lorisercole

Copy link
Copy Markdown
Contributor Author

Build tested with both MSVC cl and clang-cl compilers, with the following flags to 1CS corporate policies:

-DCMAKE_CXX_FLAGS="/W3 /w14018 /w14055 /w14100 /w14102 /w14127 /w14146 /w14242 /w14244 /w14245 /w14254 /w14267 /w14302 /w14306 /w14308 /w14310 /w14389 /w14509 /w14510 /w14512 /w14532 /w14533 /w14610 /w14611 /w14700 /w14701 /w14703 /w14789 /w14995 /w14996" `
-DCMAKE_CXX_FLAGS_RELEASE="/O2 /DNDEBUG" `

+ same for C flags

Pick up fix for -Wunused-parameter on backend in kernel_factory.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant