Bug: get_degrees() fails after hypergraph() due to hardcoded column names #766

@lmeyerov

Summary

get_degrees() appears to be hardcoded to look for edge columns named 'src' and 'dst', so it fails when called after hypergraph(), which creates edges with different column names. It should use the graph's edge bindings instead.

Environment

  • PyGraphistry: v0.43.2
  • Python: 3.10.16

Reproduction

import pandas as pd
import graphistry

# Create test data
nodes_df = pd.DataFrame([
    ['Pineville Park', 'Pineville', 'Park'],
    ['Pineville Grocery store', 'Pineville', 'Grocery store'],
    ['Maplewood Pet shop', 'Maplewood', 'Pet shop'],
], columns=['name', 'town', 'point_of_interest'])

edges_df = pd.DataFrame([
    ['Pineville Park', 'Pineville Grocery store'],
], columns=['source', 'destination'])

g = graphistry.nodes(nodes_df, 'name').edges(edges_df, 'source', 'destination')

# Step 1: Create hypergraph (changes edge structure)
g2 = g.hypergraph(entity_types=['town'], direct=True)

# Step 2: Try to get degrees - this fails
g3 = g2.get_degrees()  # ❌ KeyError: "None of [Index(['src', 'dst'], dtype='object')] are in the [columns]"

Error Message

KeyError: "None of [Index(['src', 'dst'], dtype='object')] are in the [columns]"

Root Cause

get_degrees() appears to be hardcoded to look for columns named 'src' and 'dst':

# Likely in get_degrees() implementation:
degrees = edges_df[['src', 'dst']].stack().value_counts()  # ❌ Assumes these column names

After hypergraph(), edges have different column names (e.g., 'node1', 'node2', or custom names), so get_degrees() fails.
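
The failure mode can be reproduced with pandas alone, without PyGraphistry. This is a minimal sketch that assumes the suspected implementation above; the edge table and its 'node1'/'node2' column names are illustrative, not actual hypergraph() output:

```python
import pandas as pd

# Illustrative edge table whose endpoint columns are NOT named 'src'/'dst'.
edges_df = pd.DataFrame({"node1": ["a", "a"], "node2": ["b", "c"]})

try:
    # Same selection as the suspected hardcoded lookup.
    edges_df[["src", "dst"]].stack().value_counts()
    error_message = None
except KeyError as exc:
    # pandas raises KeyError when none of the requested columns exist.
    error_message = str(exc)

print(error_message)
```

If the real implementation selects columns this way, any edge table bound to other column names would hit the same KeyError.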

Expected Behavior

get_degrees() should use the graph's edge bindings:

# Should use:
src_col = g._source  # Get actual source column binding
dst_col = g._destination  # Get actual destination column binding
degrees = edges_df[[src_col, dst_col]].stack().value_counts()
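
As a rough illustration of the binding-aware approach, the sketch below computes degrees from whatever column names are passed in. The helper name degrees_from_bindings is hypothetical and not part of PyGraphistry, and it uses pd.concat rather than stack(), which should give the same counts when no endpoint is null:

```python
import pandas as pd


def degrees_from_bindings(edges_df, src_col, dst_col):
    """Count how many edge endpoints reference each node ID.

    Hypothetical helper: src_col and dst_col stand in for the graph's
    source and destination bindings (g._source and g._destination).
    """
    endpoints = pd.concat(
        [edges_df[src_col], edges_df[dst_col]], ignore_index=True
    )
    return endpoints.value_counts()


# Illustrative edge table with non-default column names.
edges_df = pd.DataFrame({"node1": ["a", "a"], "node2": ["b", "c"]})
degrees = degrees_from_bindings(edges_df, "node1", "node2")
print(degrees.to_dict())
```

Here node 'a' appears as an endpoint of two edges while 'b' and 'c' appear once each, regardless of what the endpoint columns are called.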

Impact

  • Cannot use get_degrees() after hypergraph() in any workflow
  • Affects GFQL chains: [hypergraph(...), call('get_degrees')]
  • Limits composability of graph operations

Workaround

Manually rename the edge columns to 'src'/'dst' and rebind them before calling get_degrees():

g2 = g.hypergraph(entity_types=['town'], direct=True)

# Workaround: Rename edge columns
edges_renamed = g2._edges.rename(columns={g2._source: 'src', g2._destination: 'dst'})
g2_fixed = g2.edges(edges_renamed, 'src', 'dst')

g3 = g2_fixed.get_degrees()  # ✅ Now works
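
The workaround can be wrapped in a small helper so it is easy to remove once the bug is fixed. This is a sketch only: with_src_dst_edges is a hypothetical name, and it relies solely on the attributes and method already used above (g._edges, g._source, g._destination, and g.edges(...)):

```python
def with_src_dst_edges(g):
    """Return a graph whose edges are bound to 'src'/'dst' columns.

    Hypothetical helper mirroring the manual workaround. If the graph is
    already bound to 'src'/'dst', it is returned unchanged.
    """
    if g._source == "src" and g._destination == "dst":
        return g
    # Rename the currently bound endpoint columns, then rebind them.
    edges_renamed = g._edges.rename(
        columns={g._source: "src", g._destination: "dst"}
    )
    return g.edges(edges_renamed, "src", "dst")
```

Usage would then be g3 = with_src_dst_edges(g2).get_degrees(). Note that the sketch does not guard against an edge table that already has an unrelated 'src' or 'dst' column; renaming in that case would produce duplicate column names.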
