feat(catalog): enrich catalog interface with more methods#102
feat(catalog): enrich catalog interface with more methods#102ChaomingZhangCN wants to merge 19 commits intoalibaba:mainfrom
Conversation
src/paimon/core/table/table.h
Outdated
| const std::string& table_name) | ||
| : schema_(schema), database_(database), table_name_(table_name) {} | ||
|
|
||
| virtual ~Table() = default; |
There was a problem hiding this comment.
Will Table have subclasses? It seems like many functions are declared as virtual.
There was a problem hiding this comment.
Pull request overview
This PR expands the C++ Catalog public interface to better align with the Java catalog API (issue #84), and begins implementing a subset of the newly added methods for the filesystem-backed catalog.
Changes:
- Extends
include/paimon/catalog/catalog.hwith many additional catalog APIs (databases/tables/views/partitions/version-management). - Implements and tests
GetTable,DropDatabase,DropTable, andRenameTableinFileSystemCatalog. - Introduces new
Database(public header) andTable/Viewabstractions (currently undersrc/).
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
include/paimon/catalog/catalog.h |
Adds many new catalog methods and supporting type aliases/forward declarations. |
include/paimon/catalog/database.h |
Adds a new Database interface type for catalog APIs. |
src/paimon/core/catalog/file_system_catalog.h |
Extends filesystem catalog class with new overrides (DropDatabase, DropTable, RenameTable, GetTable). |
src/paimon/core/catalog/file_system_catalog.cpp |
Implements GetTable, DropDatabase, DropTable, RenameTable. |
src/paimon/core/catalog/file_system_catalog_test.cpp |
Adds unit tests covering the newly implemented filesystem catalog methods. |
src/paimon/core/table/table.h |
Adds a basic Table abstraction used by GetTable. |
src/paimon/core/view/view.h |
Adds a View interface abstraction (not yet implemented by filesystem catalog). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| class Database; | ||
| class Table; | ||
| class View; | ||
| class Schema; | ||
| class Snapshot; | ||
| class PartitionStatistics; | ||
| class Tag; | ||
| class Identifier; |
There was a problem hiding this comment.
This public header forward-declares Table, View, Snapshot, Tag, etc., and returns/accepts them via std::shared_ptr<> in the public API. Since only include/ is installed (not src/), API consumers won’t have access to these type definitions, and destroying a std::shared_ptr<IncompleteType> (created in the library with the default deleter) typically requires the complete type in the consumer TU. Please either move these interface types into public headers under include/, or adjust the API to avoid owning shared_ptr to incomplete types (e.g., return metadata/IDs, or use an exported custom deleter pattern).
| #pragma once | ||
|
|
||
| #include <map> | ||
| #include <memory> |
There was a problem hiding this comment.
This header uses std::optional in Comment() but does not include <optional>, which can cause compilation failures depending on include order. Add #include <optional> to this header.
| #include <memory> | |
| #include <memory> | |
| #include <optional> |
| class PAIMON_EXPORT Database { | ||
| public: | ||
| /// ================== Table Metadata ===================== | ||
|
|
||
| /// A name to identify this database. | ||
| virtual std::string Name() const = 0; | ||
|
|
There was a problem hiding this comment.
Database declares virtual methods but does not declare a virtual destructor. Deleting a derived instance through a Database* (or via smart pointers) would be undefined behavior. Add virtual ~Database() = default; (or a protected non-virtual destructor if you want to prevent deletion via base pointer).
| /// Get the table-level options associated with this schema. | ||
| /// @return Options | ||
| virtual const std::map<std::string, std::string>& Options() const = 0; | ||
|
|
||
| /// Get an optional comment describing the table. | ||
| /// @return The table comment if set, or std::nullopt otherwise. |
There was a problem hiding this comment.
The docstrings here refer to “table-level options” and “table comment”, but this interface is for a database. Please update the wording (database-level options/comment) to match the actual type and avoid confusing generated API docs.
| /// Get the table-level options associated with this schema. | |
| /// @return Options | |
| virtual const std::map<std::string, std::string>& Options() const = 0; | |
| /// Get an optional comment describing the table. | |
| /// @return The table comment if set, or std::nullopt otherwise. | |
| /// Get the database-level options associated with this database. | |
| /// @return Options | |
| virtual const std::map<std::string, std::string>& Options() const = 0; | |
| /// Get an optional comment describing the database. | |
| /// @return The database comment if set, or std::nullopt otherwise. |
|
|
||
| #include <chrono> | ||
| #include <map> | ||
| #include <memory> |
There was a problem hiding this comment.
Catalog now uses std::optional in multiple method signatures (e.g., RollbackTo/CreateBranch/CreateTag) but the header does not include <optional>, which will cause compilation failures for translation units that include this header without already including <optional>. Add #include <optional> here.
| #include <memory> | |
| #include <memory> | |
| #include <optional> |
include/paimon/catalog/catalog.h
Outdated
| /// @return A result containing true if commit succeeded, or an error status. | ||
| virtual Result<bool> CommitSnapshot(const Identifier& identifier, const std::string& table_uuid, | ||
| const std::shared_ptr<Snapshot>& snapshot, | ||
| const std::vector<PartitionStatistics>& statistics) { |
There was a problem hiding this comment.
PartitionStatistics is only forward-declared here and appears inside std::vector<PartitionStatistics> in the public API. Standard containers generally require the element type to be complete, so this will either fail to compile or force users to include internal src/ headers. Consider moving PartitionStatistics into an installed public header (under include/), or change the API to use a public/opaque representation (e.g., a serializable struct, shared_ptr to a public interface, etc.).
| const std::vector<PartitionStatistics>& statistics) { | |
| const std::vector<std::shared_ptr<PartitionStatistics>>& statistics) { |
include/paimon/catalog/catalog.h
Outdated
| /// Get tag for table. | ||
| /// | ||
| /// @param identifier Path of the table, cannot be system name. | ||
| /// @param tag_name Tag name | ||
| /// @return A result containing the tag information, or an error status. | ||
| virtual Result<std::shared_ptr<Tag>> GetTag(const Identifier& identifier, | ||
| const std::string& tag_name) const { | ||
| return Status::NotImplemented("GetTag not implemented"); |
There was a problem hiding this comment.
The PR description lists GetTag returning TagInfo, but the public API here declares GetTag returning std::shared_ptr<Tag>. Please align the PR description and the public interface (either update the description or adjust the API) to avoid confusion and documentation drift.
| virtual std::string Query() = 0; | ||
|
|
||
| /// Loads the schema of view. | ||
| virtual std::shared_ptr<Schema> GetSchema() = 0; |
There was a problem hiding this comment.
Query() and GetSchema() are getter-style APIs but are not marked const, which prevents calling them on a const View& and is inconsistent with Name()/FullName() being const. Consider making these methods const (e.g., virtual std::string Query() const = 0; and virtual std::shared_ptr<Schema> GetSchema() const = 0;).
| virtual std::string Query() = 0; | |
| /// Loads the schema of view. | |
| virtual std::shared_ptr<Schema> GetSchema() = 0; | |
| virtual std::string Query() const = 0; | |
| /// Loads the schema of view. | |
| virtual std::shared_ptr<Schema> GetSchema() const = 0; |
src/paimon/core/table/table.h
Outdated
| } | ||
|
|
||
| /// Loads the latest schema of table. | ||
| virtual std::shared_ptr<Schema> LatestSchema() { |
There was a problem hiding this comment.
LatestSchema() returns the cached schema pointer and does not appear to mutate state, but it is not marked const. Making it const will improve const-correctness and allow calling it through const std::shared_ptr<Table>& / const Table&.
| virtual std::shared_ptr<Schema> LatestSchema() { | |
| virtual std::shared_ptr<Schema> LatestSchema() const { |
| /// Interface of a database in a catalog. | ||
| class PAIMON_EXPORT Database { | ||
| public: | ||
| /// ================== Table Metadata ===================== |
There was a problem hiding this comment.
virtual ~Database() = default;
| return Status::Invalid(fmt::format("Cannot drop system table {}.", identifier.ToString())); | ||
| } | ||
|
|
||
| PAIMON_ASSIGN_OR_RAISE(bool exist, TableExists(identifier)); |
There was a problem hiding this comment.
support drop external path here~
|
Could you also please address the Copilot suggestions? They seem mostly worth applying. Thanks! |
Purpose
Close #84
API and Format
Documentation