Skip to content

Conversation

@Shamzik
Copy link
Contributor

@Shamzik Shamzik commented Dec 4, 2025

The goal is to fix a bug in the preg_regplace_callback function.

The error was related to the calculation of the replacement string. It should be calculated for each match. Previously, it was calculated only for the first match and used as the replacement string for every match.

@Shamzik Shamzik force-pushed the kshamazov/preg_regplace_callback branch 4 times, most recently from 9b0be9b to e0af042 Compare December 8, 2025 10:48
@Shamzik Shamzik changed the title fix preg_regplace_callback function [k2] fix preg_regplace_callback function Dec 8, 2025
@Shamzik Shamzik force-pushed the kshamazov/preg_regplace_callback branch 6 times, most recently from ee80ea3 to 86a4e75 Compare December 10, 2025 21:55
@Shamzik Shamzik marked this pull request as ready for review December 11, 2025 09:57
@Shamzik Shamzik force-pushed the kshamazov/preg_regplace_callback branch from 86a4e75 to 01b4486 Compare December 11, 2025 17:25
@Shamzik Shamzik self-assigned this Dec 11, 2025
@Shamzik Shamzik added the k2 k2 related label Dec 11, 2025
@Shamzik Shamzik added this to the next milestone Dec 11, 2025
@Shamzik Shamzik requested a review from PetrShumilov December 12, 2025 15:49
@DrDet DrDet modified the milestones: 14.12.2025, next Dec 12, 2025
@apolyakov
Copy link
Contributor

Shouldn't this PR just fix the issue in preg_replace_callback function? Why are there so many changes?

@Shamzik Shamzik requested a review from PetrShumilov December 18, 2025 16:25
@Shamzik Shamzik requested a review from apolyakov December 19, 2025 11:21
Copy link
Contributor

@apolyakov apolyakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current regex could benefit from some restructuring for better readability and maintainability. It’s a bit difficult to follow the logic as it stands

co_return f$preg_replace(pattern, replacement, subject, limit, opt_count);
kphp::regex::details::Info regex_info{pattern, {subject.c_str(), subject.size()}, {}};

if (!kphp::regex::details::valid_regex_flags(flags, kphp::regex::PREG_NO_FLAGS)) [[unlikely]] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure these flags are enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zend says

flags can be a combination of the PREG_OFFSET_CAPTURE and PREG_UNMATCHED_AS_NULL flags, which influence the format of the matches array.

The matches array is the array that will be passed to the callback function. The preg_replace_callback function in KPHP accepts the argument callable(string[] $x):string $callback . Thus, the callback function expects array to be an array of strings. However, this is only possible if all flags are unset. In the old KPHP runtime, the flags argument wasn't even defined in this function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't mean we need to drop support for flags. One day, the signature of the callback function may change, and that day will be not so easy to find a reason why we don't support these flags

if (!kphp::regex::details::valid_regex_flags(flags, kphp::regex::PREG_NO_FLAGS)) [[unlikely]] {
co_return Optional<string>{};
}
if (!compile_regex(regex_info)) [[unlikely]] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand where it comes from... Please, specify a qualified name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

return true;
}

std::optional<array<mixed>> split_regex(kphp::regex::details::Info& regex_info, int64_t limit, bool no_empty, bool delim_capture,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is split_regex in anonymous namespace while compile_regex is in details one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is not required to be accessed from the header, it is only used in the f$preg_split function.

bool is_offset_capture, bool is_unmatched_as_null) noexcept;

template<std::invocable<array<string>> F>
coro::task<bool> replace_callback(Info& regex_info, F callback, uint64_t limit) noexcept {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add kphp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

k2 k2 related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants