Skip to content

Conversation

@SYaoJun
Copy link
Contributor

@SYaoJun SYaoJun commented Feb 9, 2026

Reason for this PR

Currently there are some files(.cpp/.h) not incorperate into the graphar-clformat checklist, such as python/rust directories.

file(GLOB_RECURSE FILES_NEED_FORMAT "src/graphar/*.h" "src/graphar/*.cc"
                                    "test/*.h" "test/*.cc"
                                    "examples/*.h" "examples/*.cc"
                                    "benchmarks/*.h" "benchmarks/*.cc")
file(GLOB_RECURSE FILES_NEED_LINT "src/graphar/*.h" "src/graphar/*.cc"
                                  "test/*.h" "test/*.cc"
                                  "examples/*.h" "examples/*.cc"
                                  "benchmarks/*.h" "benchmarks/*.cc")

add_custom_target(graphar-clformat
                  COMMAND clang-format --style=file -i ${FILES_NEED_FORMAT}
                  COMMENT "Running clang-format."
                  VERBATIM)

What changes are included in this PR?

  1. Make the pre-commit check all .cpp/.cc/.h/.hpp directly.

  2. Fix all not format files automatically by pre-commit run clang-format -a

Are these changes tested?

Are there any user-facing changes?

@SYaoJun
Copy link
Contributor Author

SYaoJun commented Feb 9, 2026

issue: #845

@SYaoJun SYaoJun force-pushed the 845_fix_clang_format branch from e88423c to 33d490f Compare February 9, 2026 13:46
@codecov-commenter
Copy link

codecov-commenter commented Feb 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.21%. Comparing base (6c802a1) to head (3cfe001).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##               main     #847   +/-   ##
=========================================
  Coverage     80.21%   80.21%           
  Complexity      615      615           
=========================================
  Files            93       93           
  Lines         10257    10257           
  Branches       1049     1049           
=========================================
  Hits           8228     8228           
  Misses         1789     1789           
  Partials        240      240           
Flag Coverage Δ
cpp 71.52% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@SYaoJun SYaoJun force-pushed the 845_fix_clang_format branch 3 times, most recently from 8371dc0 to 31ec000 Compare February 10, 2026 00:36
@SYaoJun
Copy link
Contributor Author

SYaoJun commented Feb 10, 2026

please review the code @yangxk1

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the repo’s formatting enforcement so clang-format is applied via pre-commit to a broader set of C/C++ sources (beyond the previous limited globbing), and aligns CI to run the same pre-commit clang-format hook. The PR also includes the resulting clang-format-only reformatting across several C++ files (including bindings, JNI, core C++ code, and vendored thirdparty code).

Changes:

  • Broaden clang-format pre-commit hook matching to all .(cc|cpp|h|hpp) files.
  • Replace the prior CI clang-format logic with a pre-commit run clang-format --all-files-style execution.
  • Apply clang-format updates across affected C++ sources (Python bindings, JNI, filesystem, and thirdparty libraries).

Reviewed changes

Copilot reviewed 8 out of 11 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
python/src/bindings/types_binding.cc Formatting-only adjustments to match clang-format output.
python/src/bindings/high_level_binding.cc Large formatting-only reflow of pybind11 bindings.
maven-projects/java/src/main/cpp/ffi/jni_org_apache_graphar_arrow_ArrowTableStatic.cc Formatting-only include/order and wrapping updates.
cpp/thirdparty/simple-uri-parser/uri_parser.h clang-format reflow of vendored URI parser header.
cpp/thirdparty/mini-yaml/yaml/Yaml.hpp clang-format reflow of vendored YAML header.
cpp/thirdparty/mini-yaml/yaml/Yaml.cpp clang-format reflow of vendored YAML implementation (very large diff).
cpp/src/graphar/filesystem.h Formatting-only signature wrapping.
cpp/src/graphar/filesystem.cc Formatting-only wrapping around template specialization/instantiation lines.
.pre-commit-config.yaml Expands clang-format hook scope to all C++ extensions.
.github/workflows/ci.yml Switch CI clang-format step to run via pre-commit.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

hooks:
- id: clang-format
files: ^python/
files: \.(cc|cpp|h|hpp)$
Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The clang-format hook now matches all C++ extensions repo-wide, which also pulls in vendored code under cpp/thirdparty (this repo appears to intentionally exclude thirdparty from tooling, e.g., the CMake graphar-clformat target and codecov ignore list). Consider excluding cpp/thirdparty (and any other vendored/generated trees) via the hook's exclude: or by tightening the files: regex, so formatting enforcement stays focused on maintained sources.

Suggested change
files: \.(cc|cpp|h|hpp)$
files: ^cpp/.*\.(cc|cpp|h|hpp)$
exclude: ^cpp/thirdparty/

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to exclude the thirdparty.


#include <algorithm>
#include <string>
#include <unordered_map>
Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header uses std::string_view (when C++17+) and std::tuple/std::tie, but it doesn't include <string_view> or ; it also calls std::isalnum/std::tolower/std::strtol without including /. To keep the header self-contained and portable across standard library implementations, add the missing standard includes near the top.

Suggested change
#include <unordered_map>
#include <unordered_map>
#include <string_view>
#include <tuple>
#include <cctype>
#include <cstdlib>

Copilot uses AI. Check for mistakes.
Comment on lines 135 to 136
auto auth_string = uri.substr(2, pos);
auto rem = uri.substr(pos + 2);
Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_authority computes pos = uri.substr(2).find('/') and then unconditionally does rem = uri.substr(pos + 2). When no '/' exists (pos == npos), pos + 2 wraps (size_t) and rem becomes an incorrect substring (e.g., s3://bucket yields path "/bucket" and later becomes "bucket/bucket"). Handle the npos case explicitly by setting rem to an empty string_view/string and taking auth_string as the remainder.

Suggested change
auto auth_string = uri.substr(2, pos);
auto rem = uri.substr(pos + 2);
uri::string_view_type auth_string;
uri::string_view_type rem;
if (pos == uri::npos) {
auth_string = uri.substr(2);
rem = uri::string_view_type{};
} else {
auth_string = uri.substr(2, pos);
rem = uri.substr(pos + 2);
}

Copilot uses AI. Check for mistakes.
Comment on lines +396 to +401
/// template specialization for std::string
template Result<IdType> FileSystem::ReadFileToValue<IdType>(
const std::string&) const noexcept;
/// template specialization for std::string
template Status FileSystem::WriteValueToFile<IdType>(const IdType&,
const std::string&) const
noexcept;
template Status FileSystem::WriteValueToFile<IdType>(
const IdType&, const std::string&) const noexcept;
Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These explicit template instantiation comments say "template specialization for std::string" but the instantiations shown are for IdType (ReadFileToValue and WriteValueToFile). Updating/removing the misleading comments would avoid confusion when maintaining these explicit instantiations.

Copilot uses AI. Check for mistakes.
Comment on lines +93 to 95
pre-commit install
pre-commit run clang-format -a

Copy link

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pre-commit install isn't needed in CI (it only installs git hooks) and can be dropped to reduce side effects and runtime; running pre-commit run clang-format --all-files directly is sufficient.

Suggested change
pre-commit install
pre-commit run clang-format -a
pre-commit run clang-format --all-files

Copilot uses AI. Check for mistakes.
@SYaoJun
Copy link
Contributor Author

SYaoJun commented Feb 10, 2026

I will exculde the thirdparty directory and update the PR.

@SYaoJun SYaoJun force-pushed the 845_fix_clang_format branch from 31ec000 to b404ea6 Compare February 10, 2026 08:19
Copy link
Contributor

@yangxk1 yangxk1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM~ thank you! @SYaoJun

@yangxk1 yangxk1 merged commit 02bd3fe into apache:main Feb 10, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants