Skip to content

Conversation

@BulkBeing
Copy link
Contributor

@BulkBeing BulkBeing commented Oct 16, 2025

The protoc generated code for Python doesn't use relative imports protocolbuffers/protobuf#1491. We were modifying the import statements after generating code with protoc:

sed -i.bak -e 's/^\(import.*_pb2\)/from . \1/' pynumaflow/proto/*/*.py

This PR uses the option in protoc to specify the package path. This results in generated code using absolute import paths and the need for above modification after code generation.

Tested by building the example images and running below pipeline:

apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: simple-source
spec:
  vertices:
    - name: in
      source:
        udsource:
          container:
            # A simple user-defined async source
            image: simple-source-py:stable
            imagePullPolicy: Never
      limits:
        readBatchSize: 2
    - name: flatmap
      udf:
        container:
          image: map-flatmap-py:stable
          imagePullPolicy: Never
          env:
            - name: PYTHONDEBUG
              value: "true"
      containerTemplate:
        env:
          - name: NUMAFLOW_DEBUG
            value: "true" # DO NOT forget the double quotes!!!
    - name: inbuilt
      sink:
        log: {}
    - name: out
      sink:
        udsink:
          container:
            args:
            - python
            - example.py
            image: async-sink-log-py:stable
            imagePullPolicy: Never
            env:
              - name: PYTHONDEBUG
                value: "true"
              - name: INVOKE
                value: "func_handler"
  edges:
    - from: in
      to: flatmap
    - from: flatmap
      to: inbuilt
    - from: flatmap
      to: out

@codecov
Copy link

codecov bot commented Oct 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.63%. Comparing base (cdf8d8e) to head (81b6116).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #255   +/-   ##
=======================================
  Coverage   93.63%   93.63%           
=======================================
  Files          65       65           
  Lines        2907     2907           
  Branches      152      152           
=======================================
  Hits         2722     2722           
  Misses        135      135           
  Partials       50       50           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@BulkBeing BulkBeing marked this pull request as ready for review October 16, 2025 12:42
@vigith
Copy link
Member

vigith commented Oct 16, 2025

The protoc generated code for Python doesn't use relative imports protocolbuffers/protobuf#1491.

This PR uses the option in protoc to specify the package path. This results in generated code using absolute import paths and the need for above modification after code generation.

If protoc does not support it, how are we using the package path in protoc? I am confused by the above 2 contradicting statements.

@BulkBeing
Copy link
Contributor Author

If we generate Python code from proto files to a sub-directory (protos/):

python -m grpc_tools.protoc -I protos/ --python_out=./rpc/ --pyi_out=./rpc/ --grpc_python_out=./rpc protos/hello_world.proto

3 files are generated in proto/ directory for each .proto file: route_guide_pb2.py, route_guide_pb2.pyi and route_guide_pb2_grpc.py.
The *_grcp.py file will have an import like this:

from route_rpc import route_guide_pb2 as route__rpc_dot_route__guide__pb2

This won't work since you will be running Python interpreter from its parent directory. So we rewrite this import as:

from . import route_guide_pb2 as route__rpc_dot_route__guide__pb2

Instead, we can use the option in -I to specify the package path:

-I rpc=protos/ --python_out=. --pyi_out=. --grpc_python_out=.

Then the generated code becomes:

from rpc import hello_world_pb2 as rpc_dot_hello__world__pb2

In our case:

-Ipynumaflow/proto/accumulator=pynumaflow/proto/accumulator

Generated code:

from pynumaflow.proto.accumulator import accumulator_pb2 as pynumaflow_dot_proto_dot_accumulator_dot_accumulator__pb2

This matches with how we import in our regular files. Eg pynumaflow/accumulator/async_server.py:

from pynumaflow.accumulator.servicer.async_servicer import AsyncAccumulatorServicer
from pynumaflow.info.types import ServerInfo, ContainerType, MINIMUM_NUMAFLOW_VERSION
from pynumaflow.proto.accumulator import accumulator_pb2_grpc

@BulkBeing BulkBeing merged commit 321942d into main Oct 17, 2025
12 checks passed
@BulkBeing BulkBeing deleted the fix-import-paths branch October 17, 2025 02:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants