yunyezhang-work commented Nov 20, 2025

What changes were proposed in this pull request?

In big data production environments, customers create a massive number of policies, often reaching hundreds of thousands or even millions. Exporting the entire policy set for disaster recovery produces an enormous data volume and makes imports into the backup cluster extremely slow. Our current experimental data shows that importing 10,000 policies via the API is very memory-intensive and takes approximately 15 minutes; importing 100,000 policies via the API takes 2.5 hours or even longer.

With an even larger number of policies, memory consumption increases significantly, and insufficient memory can interrupt the import. We therefore propose modifying the API to allow segmented export. This saves memory and makes imports into other clusters for disaster recovery more reliable.

How was this patch tested?

To test this feature manually, send an HTTP request to Ranger. Using a shell command as an example:

Without the segmentation parameters, calling the export API (getPoliciesInJson) exports all policies. As shown in the figure, this environment has 18 policies for hdfs-xxx.
curl -u$USER:$PASSWORD -XGET "http://$RANGER_HOST:$RANGER_PORT/service/plugins/policies/exportJson?serviceName=$SERVICE&checkPoliciesExists=true" -v -o export.json

Adding the segmentation parameters exports only the policies in the range specified by beginIndex and offsetIndex. As shown in the figure, policies 1-5 of hdfs-xxx are exported.
curl -u$USER:$PASSWORD -XGET "http://$RANGER_HOST:$RANGER_PORT/service/plugins/policies/exportJson?serviceName=$SERVICE&checkPoliciesExists=true&beginIndex=$BEGIN_INDEX&offsetIndex=$OFFSET_INDEX" -v -o export_${BEGIN_INDEX}_${OFFSET_INDEX}.json
image
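A disaster-recovery client would typically call the segmented endpoint in a loop. The following is a minimal sketch of how the request URLs could be planned, not code from this patch. The class `SegmentedExportPlanner`, its method, and the example host are hypothetical; it also assumes that beginIndex is zero-based, that offsetIndex is the number of policies per segment, and that the caller already knows the total policy count.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of Ranger): plans the request URLs for a segmented export.
class SegmentedExportPlanner {
    // Builds one exportJson URL per segment.
    // Assumption: beginIndex is the zero-based start and offsetIndex is the segment size.
    static List<String> buildSegmentUrls(String baseUrl, String serviceName,
                                         int totalPolicies, int segmentSize) {
        if (totalPolicies < 0 || segmentSize <= 0) {
            throw new IllegalArgumentException(
                "totalPolicies must be >= 0 and segmentSize must be > 0");
        }
        String encodedService = URLEncoder.encode(serviceName, StandardCharsets.UTF_8);
        List<String> urls = new ArrayList<>();
        // long loop counter avoids int overflow when totalPolicies is close to Integer.MAX_VALUE
        for (long begin = 0; begin < totalPolicies; begin += segmentSize) {
            urls.add(String.format(Locale.ROOT,
                "%s/service/plugins/policies/exportJson?serviceName=%s"
                    + "&checkPoliciesExists=true&beginIndex=%d&offsetIndex=%d",
                baseUrl, encodedService, begin, segmentSize));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Example values only: 18 policies exported in segments of 5.
        for (String url : buildSegmentUrls("http://ranger.example.com:6080", "hdfs-xxx", 18, 5)) {
            System.out.println(url);
        }
    }
}
```

The last segment may request more policies than remain; the sketch relies on the server clamping the end of the range, as the Math.min call in the reviewed diff suggests.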

yunyezhang-work changed the title from "Support export policies in a segmented manner" to "RANGER-5406 Support export policies in a segmented manner" Nov 20, 2025
yunyezhang-work changed the title from "RANGER-5406 Support export policies in a segmented manner" to "RANGER-5406: Support export policies in a segmented manner" Nov 20, 2025
@yunyezhang-work

@mneethiraj @kumaab
Hello. Could you please help review these two PRs? It seems GitHub doesn't assign reviewers. We hope to have more interaction with the open-source community and look forward to your reply.
#741
#739

kumaab commented Nov 26, 2025

Thank you @yunyezhang-work for the patch! Please raise a PR against the master branch; it is the branch for all dev work.

return ret;
}

private List<RangerPolicy> cutRangerPolicyList(List<RangerPolicy> policyList, SearchFilter filter) {

Suggested name: getRangerPoliciesInRange

int startIndex = filter.getBeginIndex();
int pageSize = filter.getOffsetIndex();
int toIndex = Math.min(startIndex + pageSize, totalCount);
LOG.info("==>totalCount: " + totalCount + " startIndex: " + startIndex + " pageSize: " +pageSize + " toIndex: " + toIndex);

Avoid string concatenation; use String.format() instead.
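To make the range logic and the logging suggestion concrete, here is a minimal sketch written against a generic list rather than RangerPolicy. `PolicyRangeSketch` and both of its methods are hypothetical names, and treating a negative beginIndex or pageSize as "no segmentation" is an assumption of this sketch, not confirmed behavior of the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the reviewed method, written generically so it runs on its own.
class PolicyRangeSketch {
    // Returns the items in [beginIndex, beginIndex + pageSize), clamped to the list size.
    static <T> List<T> getInRange(List<T> items, int beginIndex, int pageSize) {
        if (items == null) {
            return Collections.emptyList();
        }
        // Assumption: negative values mean "segmentation not requested", so return everything.
        if (beginIndex < 0 || pageSize < 0) {
            return new ArrayList<>(items);
        }
        if (pageSize == 0 || beginIndex >= items.size()) {
            return Collections.emptyList();
        }
        // long arithmetic avoids int overflow when beginIndex + pageSize exceeds Integer.MAX_VALUE
        int toIndex = (int) Math.min((long) beginIndex + pageSize, (long) items.size());
        // Copy the view so the result does not depend on later changes to the source list.
        return new ArrayList<>(items.subList(beginIndex, toIndex));
    }

    // Builds the log message with String.format instead of string concatenation.
    static String describeRange(int totalCount, int startIndex, int pageSize, int toIndex) {
        return String.format(Locale.ROOT,
            "totalCount: %d startIndex: %d pageSize: %d toIndex: %d",
            totalCount, startIndex, pageSize, toIndex);
    }
}
```

With an SLF4J-style logger, parameterized messages such as LOG.info("totalCount: {}", totalCount) are a common alternative that skips formatting when the log level is disabled.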

LOG.info("Invalid or Unsupported sortType : " + sortType);
}
} else {
LOG.info("Invalid or Unsupported sortBy property : " + sortBy);

yunyezhang-work changed the base branch from ranger-2.3 to master November 26, 2025 12:31
yunyezhang-work changed the base branch from master to ranger-2.3 November 26, 2025 12:31
public static final String UPDATE_TIME = "updateTime"; // sort
public static final String START_INDEX = "startIndex";
public static final String BEGIN_INDEX = "beginIndex";
public static final String OFFSET_INDEX = "offsetIndex";

I think OFFSET is more meaningful than OFFSET_INDEX; an offset is not an index. What do you think?

private int startIndex;
private int maxRows = Integer.MAX_VALUE;
private int beginIndex = -1;
private int offsetIndex = -1;

Since you've added new fields to the SearchFilter class, don't forget to modify the copy constructor (public SearchFilter(SearchFilter other)) accordingly to ensure the new attributes are properly copied.
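As a rough illustration of the point about the copy constructor, the sketch below uses a simplified stand-in class. `SearchFilterSketch` is hypothetical and carries only the four fields visible in the diff; the real SearchFilter has more state that is not shown here.

```java
// Hypothetical, simplified stand-in for SearchFilter with only the fields shown in the diff.
class SearchFilterSketch {
    private int startIndex;
    private int maxRows     = Integer.MAX_VALUE;
    private int beginIndex  = -1;
    private int offsetIndex = -1;

    SearchFilterSketch() {
    }

    // Copy constructor: every field must be copied, including the newly added ones.
    SearchFilterSketch(SearchFilterSketch other) {
        if (other != null) {
            this.startIndex  = other.startIndex;
            this.maxRows     = other.maxRows;
            this.beginIndex  = other.beginIndex;   // new field
            this.offsetIndex = other.offsetIndex;  // new field
        }
    }

    int getBeginIndex() { return beginIndex; }

    void setBeginIndex(int beginIndex) { this.beginIndex = beginIndex; }

    int getOffsetIndex() { return offsetIndex; }

    void setOffsetIndex(int offsetIndex) { this.offsetIndex = offsetIndex; }
}
```

If the two new assignments are omitted, a copied filter silently falls back to the -1 defaults and the copy no longer carries the requested segment.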

}

public void setBeginIndex(int beginIndex) {
this.beginIndex = beginIndex;

I think we should validate that beginIndex >= 0. What’s your opinion?
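One possible shape for that validation is sketched below. `BeginIndexHolder` is a hypothetical class used only to keep the example self-contained, and throwing IllegalArgumentException is an assumption about the preferred failure mode; the field keeps the -1 default from the diff to mean "not set".

```java
// Hypothetical holder class; only the setter validation is the point of this sketch.
class BeginIndexHolder {
    // -1 is the default from the diff and is treated here as "segmentation not requested".
    private int beginIndex = -1;

    void setBeginIndex(int beginIndex) {
        // Reject explicit negative values so an invalid range never reaches the slicing code.
        if (beginIndex < 0) {
            throw new IllegalArgumentException(
                String.format("beginIndex must be >= 0, got %d", beginIndex));
        }
        this.beginIndex = beginIndex;
    }

    int getBeginIndex() {
        return beginIndex;
    }

    // True once a valid beginIndex has been set.
    boolean isSegmented() {
        return beginIndex >= 0;
    }
}
```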

@yunyezhang-work

@kumaab @vyommani Thank you for your suggestions. The changes have been made in the following PR: #748
