Bug: get_degrees() fails after hypergraph() due to hardcoded column names
Summary
get_degrees() is hardcoded to look for 'src' and 'dst' edge columns, so it fails when called after hypergraph(), which creates edges with different column names. It should use the graph's edge bindings instead.
Environment
- PyGraphistry: v0.43.2
- Python: 3.10.16
Reproduction
import pandas as pd
import graphistry
# Create test data
nodes_df = pd.DataFrame([
['Pineville Park', 'Pineville', 'Park'],
['Pineville Grocery store', 'Pineville', 'Grocery store'],
['Maplewood Pet shop', 'Maplewood', 'Pet shop'],
], columns=['name', 'town', 'point_of_interest'])
edges_df = pd.DataFrame([
['Pineville Park', 'Pineville Grocery store'],
], columns=['source', 'destination'])
g = graphistry.nodes(nodes_df, 'name').edges(edges_df, 'source', 'destination')
# Step 1: Create hypergraph (changes edge structure)
g2 = g.hypergraph(entity_types=['town'], direct=True)
# Step 2: Try to get degrees - this fails
g3 = g2.get_degrees() # ❌ KeyError: "None of [Index(['src', 'dst'], dtype='object')] are in the [columns]"
Error Message
KeyError: "None of [Index(['src', 'dst'], dtype='object')] are in the [columns]"
Root Cause
get_degrees() appears to be hardcoded to look for columns named 'src' and 'dst':
# Likely in get_degrees() implementation:
degrees = edges_df[['src', 'dst']].stack().value_counts() # ❌ Assumes these column names
After hypergraph(), edges have different column names (e.g., 'node1', 'node2', or custom names), so get_degrees() fails.
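The same KeyError can be reproduced with pandas alone, which helps confirm that the failure comes from the column lookup rather than from hypergraph() itself. This is a minimal sketch; the column names 'node1' and 'node2' are illustrative assumptions, not necessarily what hypergraph() emits.

```python
import pandas as pd

# Illustrative edge table whose columns are NOT named 'src'/'dst'.
# The names 'node1'/'node2' are assumptions made for this sketch.
edges_df = pd.DataFrame({'node1': ['a', 'a'], 'node2': ['b', 'c']})

try:
    # Selecting hardcoded names that are absent from the frame raises KeyError.
    edges_df[['src', 'dst']]
    raised = False
except KeyError as err:
    raised = True
    print(f"KeyError: {err}")
```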
Expected Behavior
get_degrees() should use the graph's edge bindings:
# Should use:
src_col = g._source # Get actual source column binding
dst_col = g._destination # Get actual destination column binding
degrees = edges_df[[src_col, dst_col]].stack().value_counts()
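To make the idea concrete, here is a standalone pandas sketch of a degree computation that takes the column names as parameters instead of assuming 'src' and 'dst'. The helper name degrees_from_bindings and the output column names are hypothetical choices for this example; this is not PyGraphistry's actual implementation.

```python
import pandas as pd

def degrees_from_bindings(edges_df: pd.DataFrame, src_col: str, dst_col: str) -> pd.DataFrame:
    """Hypothetical helper: per-node degrees using caller-supplied edge column names."""
    # Out-degree: how often each node appears in the source column.
    out_deg = edges_df[src_col].value_counts()
    # In-degree: how often each node appears in the destination column.
    in_deg = edges_df[dst_col].value_counts()
    # Align on the union of node ids; nodes missing on one side get 0.
    degrees = pd.DataFrame({'degree_out': out_deg, 'degree_in': in_deg}).fillna(0).astype(int)
    degrees['degree'] = degrees['degree_in'] + degrees['degree_out']
    return degrees

# Edge table with non-default column names (illustrative).
edges_df = pd.DataFrame({'node1': ['a', 'a'], 'node2': ['b', 'c']})
result = degrees_from_bindings(edges_df, 'node1', 'node2')
print(result)
```

In a real fix, the two column names would presumably come from g._source and g._destination, as proposed above.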
Impact
- Cannot use get_degrees() after hypergraph() in any workflow
- Affects GFQL chains: [hypergraph(...), call('get_degrees')]
- Limits composability of graph operations
Workaround
Manually bind edges back to 'src'/'dst' before calling get_degrees():
g2 = g.hypergraph(entity_types=['town'], direct=True)
# Workaround: Rename edge columns
edges_renamed = g2._edges.rename(columns={g2._source: 'src', g2._destination: 'dst'})
g2_fixed = g2.edges(edges_renamed, 'src', 'dst')
g3 = g2_fixed.get_degrees() # ✅ Now works
Related
- Hypergraph chaining support added in v0.43.1 (PR #762, "fix(gfql): handle schema-changing operations in chains")
- This bug prevents full composability of hypergraph with other operations