feat/945-persistent-storage-for-peerstore #946
Conversation
@Winter-Soren : Great effort. Reviewing the PR. I re-ran the CI/CD pipeline today after updating the branch. It would be great if you could resolve the CI/CD issues.
…ithub.com/Winter-Soren/py-libp2p into feat/945-persistent-storage-for-peerstore
@Winter-Soren : Hi Soham. Thank you for resolving the CI/CD issues. One CI/CD test is still failing. We investigated it and documented it at #949. @yashksaini-coder, @sumanjeet0012 and @acul71 are fixing it. We will soon have all the CI/CD tests passing.
@Winter-Soren : Will re-run the CI/CD pipeline once the issue is fixed, today or in a couple of days.
Can I get access to this so I can check it out and work on it, @seetadev @Winter-Soren?
Hey @yashksaini-coder,
@Winter-Soren : Updated the branch and re-ran the CI/CD pipeline. There seems to be one test issue. Looking forward to collaborating with you and resolving it soon.
@Winter-Soren : The CI/CD issue got resolved. Wonderful.
@pacrob : Hi Paul. I would like your feedback on this PR. On the same note, I will ask @Winter-Soren and @yashksaini-coder to include a comprehensive test suite and also add a newsfragment.
I added some questions. In this case, I do think example usage would be important. Tests and a newsfragment are also needed here.
Thanks @pacrob for the detailed review and helpful comments 🙏 @Winter-Soren — please take a look at these points and update the PR soon. The switch to Trio (instead of asyncio) is important for consistency with the rest of the lib, so let's refactor accordingly. Also, make sure the PeerData fields (last_identified, ttl, and latmap) are properly handled so we don't lose data integrity. Adding a couple of example usages, along with tests and a newsfragment, will round this out nicely and make it easier for others to build on your work. Once you've made the updates, tag me and @pacrob for a quick re-review — this is very close to merge-ready! 🚀
Hi @Winter-Soren

```python
def _persist_peer_data_sync(self, peer_id: ID, peer_data: PeerData) -> None:
    """Synchronously persist peer data to the datastore."""
    # Schedule persistence for the next async operation
    # This is a workaround for the sync/async split
    # The data will be persisted when the next async method is called
    pass
```

What do you think? Full review here:

PR 946 Review: Persistent Storage for PeerStore

Overview

PR: #946 - feat/945-persistent-storage-for-peerstore

Summary

This PR implements persistent storage for py-libp2p's peerstore, addressing the critical limitation where peer data is lost on process restart. The implementation follows go-libp2p's

Key Components

1. Datastore Abstraction (
| Aspect | Go-libp2p pstoreds | Python Implementation |
|---|---|---|
| Sync/Async | Fully synchronous | Hybrid sync/async (potential data loss) |
| Persistence | Immediate persistence | Deferred persistence via async calls |
| Data Format | Protobuf serialization | Pickle serialization |
| Cache Strategy | ARC cache with configurable size | Simple dict with no eviction |
| Error Handling | Comprehensive error propagation | Basic error handling |
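The table contrasts go-libp2p's ARC cache with the Python side's unbounded dict. As a rough illustration of what a bounded cache could look like, here is a minimal LRU sketch using the standard library (an LRU standing in for ARC, which additionally tracks access frequency; all names here are hypothetical, not the PR's API):

```python
from collections import OrderedDict

class BoundedPeerCache:
    """Bounded LRU cache sketch. Unlike a plain dict, it evicts the
    least recently used entry once max_size is exceeded."""

    def __init__(self, max_size: int = 256) -> None:
        self.max_size = max_size
        self._data: OrderedDict = OrderedDict()

    def get(self, key: str):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key: str, value: object) -> None:
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict least recently used
```

With eviction in place, a long-running node's peer cache stays bounded instead of growing with every peer ever seen.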
Go Implementation Details

```go
// Go uses immediate persistence
func (r *addrsRecord) flush(write ds.Write) error {
	// Immediate write to datastore
	return write.Put(context.TODO(), key, data)
}
```

```python
# Python uses deferred persistence
def _persist_peer_data_sync(self, peer_id: ID, peer_data: PeerData) -> None:
    # Currently a no-op - data loss risk!
    pass
```

Technical Analysis
Strengths ✅

- Architectural Soundness: Follows go-libp2p patterns with a clean datastore abstraction
- Complete Implementation: All PeerData fields persisted, including `last_identified`, `ttl`, `latmap`
- Multiple Backends: SQLite, LevelDB, RocksDB, Memory support
- Good Documentation: Comprehensive examples and docstrings
- Trio Integration: Fixed asyncio → trio migration per reviewer feedback
Critical Issues ⚠️

1. Sync Persistence Gap

```python
def _persist_peer_data_sync(self, peer_id: ID, peer_data: PeerData) -> None:
    """Synchronously persist peer data to the datastore."""
    # This is currently a no-op - needs implementation
    pass
```

The Core Problem: The persistent peerstore API is synchronous (methods like `add_addrs()` return immediately), but persistence is asynchronous (data gets saved to disk later). This creates a dangerous mismatch.

What Happens Now:

1. User calls `peerstore.add_addrs(peer_id, [addr], 3600)`
2. Data is immediately stored in the memory cache ✅
3. `_persist_peer_data_sync()` is called but does nothing ❌
4. If the process crashes before async persistence occurs → data is lost 💥

The Risk Window: There's a gap between when data is "added" (in memory) and when it's actually persisted (to disk). During this window, the data exists only in volatile memory and can be lost on any crash or restart.

Go Comparison: Go-libp2p persists data immediately on every operation - no gap, no data loss risk.

Impact: This breaks the fundamental promise of "persistent" storage and makes the peerstore unreliable for production use.
2. Data Consistency

- Sync methods modify the in-memory cache only
- Persistence happens later via async calls
- Creates potential inconsistency between memory and storage
Review Comments Analysis

pacrob's Concerns (Status)

1. Trio Consistency ✅ RESOLVED
   - Fixed: `asyncio.Lock()` → `trio.Lock()` in all datastore implementations
   - Fixed: Examples use `trio.run()` instead of `asyncio.run()`

2. `trio.from_thread` Usage ⚠️ PARTIALLY ADDRESSED
   - Reduced usage, but the sync/async split remains problematic
   - Sync methods operate on the in-memory cache only

3. PeerData Fields Persistence ✅ RESOLVED
   - Fixed: Now persisting `last_identified`, `ttl`, `latmap` fields
   - Added `_get_additional_key()` for serialization
4. Examples and Documentation ✅ RESOLVED
   - Added comprehensive example and newsfragment
Recommendations

Critical Fixes Required

1. Implement Sync Persistence

```python
def _persist_peer_data_sync(self, peer_id: ID, peer_data: PeerData) -> None:
    """Synchronously persist peer data to the datastore."""
    # Current: no-op (data loss risk!)
    # Needed: immediate persistence or background task
    pass
```
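The "background task" alternative named above could be sketched with a stdlib queue and worker thread. This is a hypothetical illustration, not the PR's implementation; note it narrows but does not eliminate the loss window, since entries still queued at crash time are lost:

```python
import queue
import threading

class BackgroundPersister:
    """Queues writes from sync callers and flushes them on a worker
    thread, so sync methods stay non-blocking (sketch)."""

    def __init__(self, store: dict) -> None:
        self._store = store  # stand-in for a real datastore
        self._queue: queue.Queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:  # shutdown sentinel
                self._queue.task_done()
                break
            key, value = item
            self._store[key] = value  # the real backend write goes here
            self._queue.task_done()

    def persist(self, key: str, value: bytes) -> None:
        """Called from sync peerstore methods; returns immediately."""
        self._queue.put((key, value))

    def close(self) -> None:
        """Flush all pending writes, then stop the worker."""
        self._queue.put(None)
        self._queue.join()
```

The trade-off versus write-through: callers never block on the datastore, but durability is only guaranteed after `close()` (or a periodic flush) completes.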
2. Add Integration Tests
   - Test persistence across restarts
   - Test with real database backends
   - Test error-handling scenarios
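The first bullet, persistence across restarts, follows a simple pattern: write, close everything, reopen, and assert the data is still there. A hedged pytest-style sketch using the stdlib `sqlite3` module (table and column names are hypothetical, not the PR's schema):

```python
import os
import sqlite3
import tempfile

def test_peer_data_survives_restart() -> None:
    path = os.path.join(tempfile.mkdtemp(), "peerstore.db")

    # First "process": write a peer record and close the connection.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS peers (peer_id TEXT PRIMARY KEY, data BLOB)"
    )
    conn.execute("INSERT INTO peers VALUES (?, ?)", ("QmPeer", b"serialized-peer-data"))
    conn.commit()
    conn.close()

    # Second "process": reopen the same file and verify the record survived.
    conn = sqlite3.connect(path)
    row = conn.execute(
        "SELECT data FROM peers WHERE peer_id = ?", ("QmPeer",)
    ).fetchone()
    conn.close()
    assert row is not None and row[0] == b"serialized-peer-data"
```

The same shape generalizes to the other backends: construct the peerstore, tear it down completely, construct a fresh one over the same storage path, and compare.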
Suggested Improvements
- Add configuration options for cache size, cleanup intervals
- Implement connection pooling for database backends
- Add migration utilities from in-memory peerstore
Conclusion
Recommendation: APPROVE with critical fixes
This is a well-architected implementation that addresses a critical need in py-libp2p. The code follows go-libp2p patterns and provides comprehensive persistent storage.
Required Before Merge

- Fix `_persist_peer_data_sync()` - currently a no-op with data loss risk
- Add integration tests for persistence across restarts
- Add comprehensive error handling for persistence failures
Files Changed

- Added: 2,609 lines across multiple files
- Key Files: `persistent_peerstore.py` (724 lines), datastore implementations, examples, newsfragment
The implementation significantly improves py-libp2p's resilience and brings it closer to feature parity with other libp2p implementations.
…tion risk in _persist_peer_data_sync
Thank you so much @acul71 for the detailed feedback.
Hi @seetadev @pacrob @acul71,
@Winter-Soren : Thank you so much for asking this question. Appreciate it. If the hybrid async/sync approach poses a real risk of data loss or corruption, I'd recommend going with the fully synchronous go-libp2p approach. Reliability should take priority over performance here, unless there's a strong reason to keep async for throughput. On a similar note, please organize the code in a modular architecture so that anyone who wishes to switch from sync to async can write the async mechanism and switch over without risking data corruption. Async is important for streaming and high-throughput work with WebRTC or UDP.
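The modular seam described in the comment above could be sketched as a strategy interface: the peerstore depends on an abstract persistence strategy, and a future async implementation slots in without touching peerstore logic. All class names here are hypothetical illustrations:

```python
from abc import ABC, abstractmethod

class PersistenceStrategy(ABC):
    """Pluggable persistence seam (sketch): switching from sync to
    async means providing a new strategy, not rewriting the peerstore."""

    @abstractmethod
    def persist(self, key: str, value: bytes) -> None: ...

class SyncPersistence(PersistenceStrategy):
    """Default reliable path: write-through before returning."""

    def __init__(self, store: dict) -> None:
        self._store = store  # stand-in for a real datastore backend

    def persist(self, key: str, value: bytes) -> None:
        self._store[key] = value

class PeerStore:
    """Peerstore logic stays strategy-agnostic."""

    def __init__(self, strategy: PersistenceStrategy) -> None:
        self._strategy = strategy

    def add_peer(self, peer_id: str, data: bytes) -> None:
        self._strategy.persist(f"/peers/{peer_id}", data)
```

An async strategy would implement the same interface (e.g. by handing writes to a trio task) while the sync default keeps the reliability-first behavior recommended above.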
Optional Dependencies for Persistent Peerstore

The persistent peerstore implementation supports multiple backend storage engines. While the SQLite and in-memory backends work out of the box, the LevelDB and RocksDB backends require additional system dependencies that can be challenging to install across different platforms.

Supported Backends

✅ Always Available
… implementations - Updated method signatures in IAsyncAddrBook and IAsyncProtoBook to improve readability by formatting parameters across multiple lines. - Removed unused imports of AsyncGenerator and Coroutine from various files. - Cleaned up whitespace in multiple files to maintain consistency and improve code readability. - Ensured that docstrings remain clear and informative after changes.
…atastore implementations - Updated method signatures in IAsyncCertifiedAddrBook to use Envelope | None for better type clarity. - Cleaned up whitespace in various files, including peerstore and datastore implementations, to enhance code readability. - Simplified iterator usage in LevelDBDatastoreSync to yield directly from the iterator. - Ensured consistent formatting in the __all__ declarations across datastore modules.
- Added type ignore comments for optional imports of plyvel and rocksdb to improve type checking. - Updated the RocksDBDatastoreSync class to include a type ignore comment for the dynamically imported _rocksdb module, ensuring better compatibility with type checkers.
- Updated backoff handling logic in the GossipSub class to always add backoff, regardless of topic presence in the mesh. - Adjusted the wait time in the test_handle_prune function to 0.5 seconds for better compatibility on Windows systems.
PR Review Summary: Persistent Storage for Peerstore #946

Summary:

✅ Good Points

❗ Critical Issues (Must Fix)
AI Pull Request Review: PR #946 - feat/945-persistent-storage-for-peerstore

PR: #946

1. Summary of Changes

This PR implements persistent storage for py-libp2p's peerstore, addressing issue #945. The implementation enables peer data (addresses, keys, metadata, protocols, latency metrics) to survive process restarts, addressing a critical limitation where all peer information was lost on restart.

Key Components Added:

Files Changed:

Breaking Changes: None identified. This is a new feature addition that maintains backward compatibility with the existing in-memory peerstore.

2. Strengths

Architecture & Design
Code Quality
Testing
Documentation
3. Issues Found

Critical

3.1 Security: Unsafe Pickle Usage
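Unpickling attacker-controlled bytes can execute arbitrary code, which is why reviewers flag it for data read back from disk. One safer alternative is schema-checked JSON (go-libp2p uses protobuf; the field names below are assumptions for illustration):

```python
import json

# Allow-list of fields a peer record may contain (assumed names).
PEER_FIELDS = {"addrs", "protocols", "metadata", "last_identified", "ttl", "latmap"}

def serialize_peer(record: dict) -> bytes:
    """Serialize only known, JSON-safe fields; unlike pickle, the
    result cannot execute code when loaded."""
    unknown = set(record) - PEER_FIELDS
    if unknown:
        raise ValueError(f"unknown peer fields: {unknown}")
    return json.dumps(record, sort_keys=True).encode()

def deserialize_peer(raw: bytes) -> dict:
    """Parse and validate a stored record; reject malformed shapes."""
    record = json.loads(raw)
    if not isinstance(record, dict) or set(record) - PEER_FIELDS:
        raise ValueError("malformed peer record")
    return record
```

JSON cannot round-trip arbitrary Python objects the way pickle can, so complex values (e.g. key objects) would need explicit encoding, but that explicitness is exactly what removes the code-execution risk.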
3.2 Resource Management: Unreliable
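A common fix for unreliable finalization is an explicit context manager, so the backend is closed deterministically on scope exit instead of whenever (or whether) `__del__` runs. A minimal sketch with hypothetical names:

```python
class DatastoreHandle:
    """Context-managed datastore lifecycle (sketch): close() is
    guaranteed on exit, even if an exception is raised inside the block."""

    def __init__(self) -> None:
        self.closed = False

    def put(self, key: str, value: bytes) -> None:
        if self.closed:
            raise RuntimeError("datastore is closed")
        # a real backend would write here

    def close(self) -> None:
        # a real backend would flush pending writes and release the DB here
        self.closed = True

    def __enter__(self) -> "DatastoreHandle":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()
```

Usage: `with DatastoreHandle() as ds: ds.put("/peers/QmA", b"...")` guarantees the flush-and-close runs even on error, which `__del__` does not.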
…ntext managers, and configurable sync
Thank you so much @yashksaini-coder and @acul71 for your comprehensive feedback! I have addressed all the critical feedback.

✅ Original Critical Issues - ALL COMPLETED
✅ Additional Issues Fixed
🔧 Type Checking Issues Resolved
🧪 Testing Status
The persistent peerstore implementation is now production-ready with:
…ation modules - Cleaned up whitespace and improved formatting for better readability in example usage of persistent peerstore. - Updated serialization methods to ensure consistent formatting and clarity in the codebase. - Enhanced context manager exit method signatures across various datastore implementations for improved readability.
- Simplified imports for public and private key serialization by removing unnecessary key type mappings. - Improved comments for clarity regarding the limitations of reconstructing PeerRecordState. - Enhanced code readability through consistent formatting and whitespace adjustments.
What was wrong?
Currently, py-libp2p only provides an in-memory peerstore implementation that loses all peer data (addresses, keys, metadata, protocols) when the process restarts. This limits the resilience and performance of py-libp2p nodes, especially for long-running applications that need to maintain peer information across restarts.
Issue #945
How was it fixed?
Implemented a datastore-agnostic persistent peer storage system following the same architectural pattern as go-libp2p's `pstoreds` package. The solution provides:

Summary of approach
Core Components:
- Datastore Abstraction (`libp2p/peer/datastore/`): Pluggable interface supporting multiple backends
- Persistent PeerStore (`libp2p/peer/persistent_peerstore.py`): Full async implementation with memory caching
- Factory (`libp2p/peer/persistent_peerstore_factory.py`): Convenient creation methods

Supported Backends:
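A minimal sketch of what the pluggable datastore abstraction and an always-available in-memory backend could look like. Method names follow go-libp2p's `ds.Datastore` get/put/delete shape and are assumptions, not the PR's exact API:

```python
from abc import ABC, abstractmethod
from typing import Optional

class Datastore(ABC):
    """Minimal pluggable key-value interface (sketch); concrete
    backends (SQLite, LevelDB, RocksDB, memory) would implement it."""

    @abstractmethod
    def get(self, key: str) -> Optional[bytes]: ...

    @abstractmethod
    def put(self, key: str, value: bytes) -> None: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...

class MemoryDatastore(Datastore):
    """Dict-backed backend: always available, nothing persists."""

    def __init__(self) -> None:
        self._data: dict = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)
```

Because the peerstore only sees the `Datastore` interface, swapping SQLite for LevelDB (or the in-memory backend in tests) is a constructor argument, not a code change.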
To-Do
Cute Animal Picture