Closed
Labels: bug
Description
Describe the bug
When the ABS source is configured as described in the documentation, the ingestion run fails with the following error:
Failed to configure the source (abs): 'dict' object has no attribute 'is_abs'
Stack trace:
~~~~ Execution Summary - RUN_INGEST ~~~~
Execution finished with errors.
{'exec_id': 'bf6564a0-e1c3-4edd-a453-aa8d444afa47',
'infos': ['2025-11-17 12:00:30.528159 INFO: Starting execution for task with name=RUN_INGEST',
"2025-11-17 12:01:06.185933 INFO: Failed to execute 'datahub ingest', exit code 1",
'2025-11-17 12:01:06.190365 INFO: Caught exception EXECUTING task_id=bf6564a0-e1c3-4edd-a453-aa8d444afa47, name=RUN_INGEST, '
'stacktrace=Traceback (most recent call last):\n'
' File "/home/datahub/.venv/lib/python3.11/site-packages/acryl/executor/execution/default_executor.py", line 153, in execute_task\n'
' task_event_loop.run_until_complete(task_future)\n'
' File "/usr/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete\n'
' return future.result()\n'
' ^^^^^^^^^^^^^^^\n'
' File "/home/datahub/.venv/lib/python3.11/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 324, in execute\n'
' await self._execute_with_debug(validated_args, ctx, exec_id)\n'
' File "/home/datahub/.venv/lib/python3.11/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 386, in '
'_execute_with_debug\n'
' self._handle_subprocess_completion(\n'
' File "/home/datahub/.venv/lib/python3.11/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 633, in '
'_handle_subprocess_completion\n'
' raise TaskError("Failed to execute \'datahub ingest\'")\n'
"acryl.executor.execution.task.TaskError: Failed to execute 'datahub ingest'\n"],
'errors': []}
~~~~ Ingestion Logs ~~~~
Setting up venv for plugin 'abs' with version '1.3.1.2'
Creating dynamic venv - this may take a few minutes...
Creating new venv: /tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867
+/usr/bin/uv venv --python /home/datahub/.venv/bin/python3 /tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867
Using CPython 3.11.13 interpreter at: /home/datahub/.venv/bin/python3
Creating virtual environment at: datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867
Installing requirements from: /tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/requirements.txt
+cat /tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/requirements.txt
# Generated at 2025-11-17T12:00:30.679054+00:00
acryl-datahub[abs]==1.3.1.2
+/usr/bin/uv pip install -r /tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/requirements.txt
Using Python 3.11.13 environment at: datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867
Resolved 101 packages in 12.90s
Downloading botocore (13.5MiB)
Downloading pandas (11.6MiB)
Downloading acryl-datahub (2.4MiB)
Downloading aiohttp (1.7MiB)
Downloading numpy (13.9MiB)
Downloading pyarrow (42.9MiB)
Downloading sqlalchemy (3.2MiB)
Downloading cryptography (4.1MiB)
Downloading pydantic-core (1.8MiB)
Building pyspark==3.5.7
Building unicodecsv==0.14.1
Building linear-tsv==1.1.0
Built linear-tsv==1.1.0
Downloading aiohttp
Downloading pydantic-core
Built unicodecsv==0.14.1
Downloading acryl-datahub
Downloading sqlalchemy
Downloading cryptography
Downloading numpy
Downloading botocore
Downloading pandas
Downloading pyarrow
Built pyspark==3.5.7
Prepared 101 packages in 14.44s
Installed 101 packages in 245ms
+ acryl-datahub==1.3.1.2
+ aiohappyeyeballs==2.6.1
+ aiohttp==3.13.2
+ aiosignal==1.4.0
+ annotated-types==0.7.0
+ anyio==4.11.0
+ asgiref==3.10.0
+ attrs==25.4.0
+ avro==1.12.1
+ avro-gen3==0.7.16
+ azure-common==1.1.28
+ azure-core==1.36.0
+ azure-identity==1.25.1
+ azure-storage-blob==12.27.1
+ azure-storage-file-datalake==12.22.0
+ boto3==1.40.74
+ botocore==1.40.74
+ bracex==2.6
+ cached-property==2.0.1
+ cachetools==6.2.2
+ certifi==2025.11.12
+ cffi==2.0.0
+ chardet==5.2.0
+ charset-normalizer==3.4.4
+ click==8.3.1
+ click-default-group==1.2.4
+ click-spinner==0.1.10
+ cryptography==46.0.3
+ dataflows-tabulator==1.54.3
+ deprecated==1.3.1
+ docker==7.1.0
+ et-xmlfile==2.0.0
+ expandvars==1.1.2
+ frozenlist==1.8.0
+ greenlet==3.2.4
+ h11==0.16.0
+ httpcore==1.0.9
+ httpx==0.28.1
+ humanfriendly==10.0
+ idna==3.11
+ ijson==3.4.0.post0
+ isodate==0.7.2
+ jmespath==1.0.1
+ jsonlines==4.0.0
+ jsonref==1.1.0
+ jsonschema==4.25.1
+ jsonschema-specifications==2025.9.1
+ linear-tsv==1.1.0
+ mixpanel==5.0.0
+ more-itertools==10.8.0
+ msal==1.34.0
+ msal-extensions==1.3.1
+ multidict==6.7.0
+ mypy-extensions==1.1.0
+ numpy==2.3.5
+ openpyxl==3.1.5
+ packaging==25.0
+ pandas==2.3.3
+ parse==1.20.2
+ progressbar2==4.5.0
+ propcache==0.4.1
+ psutil==7.1.3
+ py4j==0.10.9.7
+ pyarrow==22.0.0
+ pycparser==2.23
+ pydantic==2.12.4
+ pydantic-core==2.41.5
+ pydeequ==1.5.0
+ pyjwt==2.10.1
+ pyspark==3.5.7
+ python-dateutil==2.9.0.post0
+ python-utils==3.9.1
+ pytz==2025.2
+ pyyaml==6.0.3
+ referencing==0.37.0
+ requests==2.32.5
+ requests-file==3.0.1
+ rfc3986==2.0.0
+ rpds-py==0.29.0
+ ruamel-yaml==0.18.16
+ ruamel-yaml-clib==0.2.15
+ s3transfer==0.14.0
+ sentry-sdk==2.44.0
+ six==1.17.0
+ smart-open==7.5.0
+ sniffio==1.3.1
+ sqlalchemy==2.0.44
+ tableschema==1.21.0
+ tabulate==0.9.0
+ toml==0.10.2
+ typing-extensions==4.15.0
+ typing-inspect==0.9.0
+ typing-inspection==0.4.2
+ tzdata==2025.2
+ ujson==5.11.0
+ unicodecsv==0.14.1
+ urllib3==2.5.0
+ wcmatch==10.1
+ wrapt==2.0.1
+ xlrd==2.0.2
+ yarl==1.22.0
✅ Venv ready at: /tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867
This version of datahub supports report-to functionality
+ exec datahub ingest run -c /tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/recipe.yml --report-to /tmp/datahub/logs/bf6564a0-e1c3-4edd-a453-aa8d444afa47/artifacts/ingestion_report.json
2025-11-17 12:01:03,073 [datahub.masking.bootstrap] INFO: Initializing secret masking infrastructure
2025-11-17 12:01:03,073 [datahub.masking.masking_filter] INFO: Installed SecretMaskingFilter on root logger
2025-11-17 12:01:03,073 [datahub.masking.masking_filter] DEBUG: Wrapped sys.stdout with StreamMaskingWrapper
2025-11-17 12:01:03,073 [datahub.masking.masking_filter] DEBUG: Wrapped sys.stderr with StreamMaskingWrapper
2025-11-17 12:01:03,073 [datahub.masking.masking_filter] DEBUG: Updated 4 logging handlers to use wrapped streams
2025-11-17 12:01:03,073 [datahub.masking.bootstrap] DEBUG: Installed custom exception hook for secret masking
2025-11-17 12:01:03,074 [datahub.masking.bootstrap] INFO: Secret masking infrastructure initialized successfully. Secrets will be registered automatically as they are loaded.
[2025-11-17 12:01:03,074] INFO {datahub.cli.ingest_cli:155} - DataHub CLI version: 1.3.1.2
[2025-11-17 12:01:03,075] INFO {datahub.ingestion.run.pipeline:202} - No sink configured, attempting to use the default datahub-rest sink.
[2025-11-17 12:01:03,105] INFO {datahub.ingestion.run.pipeline:225} - Sink configured successfully. DataHubRestEmitter: configured to talk to http://datahub-gms:8080
[2025-11-17 12:01:04,827] ERROR {datahub.entrypoints:249} - Command failed: Failed to configure the source (abs): 'dict' object has no attribute 'is_abs'
Traceback (most recent call last):
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/ingestion/run/pipeline.py", line 78, in _add_init_error_context
yield
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/ingestion/run/pipeline.py", line 247, in __init__
source_class.create(
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/ingestion/source/abs/source.py", line 167, in create
config = DataLakeSourceConfig.model_validate(config_dict)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/pydantic/main.py", line 716, in model_validate
return cls.__pydantic_validator__.validate_python(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/configuration/common.py", line 156, in _track_nesting_context
instance = handler(data)
^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/ingestion/source/abs/config.py", line 117, in check_path_specs_and_infer_platform
guessed_platforms = set(
^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/ingestion/source/abs/config.py", line 118, in <genexpr>
"abs" if path_spec.is_abs else "file" for path_spec in path_specs
^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'is_abs'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/entrypoints.py", line 236, in main
sys.exit(datahub(standalone_mode=False, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/click/core.py", line 1485, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/click/core.py", line 1406, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/click/core.py", line 1873, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/click/core.py", line 1873, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/click/core.py", line 1269, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/click/core.py", line 824, in invoke
return callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/telemetry/telemetry.py", line 490, in wrapper
raise e
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/telemetry/telemetry.py", line 438, in wrapper
res = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/telemetry/telemetry.py", line 490, in wrapper
raise e
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/telemetry/telemetry.py", line 438, in wrapper
res = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/upgrade/upgrade.py", line 491, in async_wrapper
ret = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/cli/ingest_cli.py", line 176, in run
pipeline = Pipeline.create(
^^^^^^^^^^^^^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/ingestion/run/pipeline.py", line 430, in create
return cls(
^^^^
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/ingestion/run/pipeline.py", line 243, in __init__
with _add_init_error_context(
File "/usr/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/tmp/datahub/ingest/bf6564a0-e1c3-4edd-a453-aa8d444afa47/venv-abs-50e815d22ff70867/lib/python3.11/site-packages/datahub/ingestion/run/pipeline.py", line 82, in _add_init_error_context
raise PipelineInitError(f"Failed to {step}: {e}") from e
datahub.ingestion.run.pipeline.PipelineInitError: Failed to configure the source (abs): 'dict' object has no attribute 'is_abs'
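The innermost frame of the traceback shows `path_spec.is_abs` being evaluated on a plain `dict`. One possible reading, not confirmed in this report, is that `check_path_specs_and_infer_platform` runs while the `path_specs` entries are still raw recipe dictionaries rather than parsed objects. The dependency-free sketch below reproduces the same error message under that assumption. `PathSpec`, its `is_abs` rule, `infer_platforms`, and the example URL are hypothetical stand-ins, not DataHub's actual code.

```python
from dataclasses import dataclass


@dataclass
class PathSpec:
    """Hypothetical stand-in for DataHub's path spec model (not the real class)."""

    include: str

    @property
    def is_abs(self) -> bool:
        # Assumption for illustration only: treat Azure Blob Storage URLs as "abs".
        return ".blob.core.windows.net/" in self.include


def infer_platforms(path_specs):
    # Same expression as the one shown at config.py line 118 in the traceback.
    return set("abs" if path_spec.is_abs else "file" for path_spec in path_specs)


# Raw recipe values, as they look before any parsing into objects has happened.
raw_path_specs = [{"include": "https://account.blob.core.windows.net/container/*.csv"}]

try:
    infer_platforms(raw_path_specs)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)

# Converting each dict into a PathSpec first makes the attribute access valid.
parsed_path_specs = [PathSpec(**spec) for spec in raw_path_specs]
platforms = infer_platforms(parsed_path_specs)
print(platforms)
```

If this reading is right, the check would need to run after the entries are parsed, or convert them itself before touching `is_abs`. Which of the two fits the actual source is not something this report can settle.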
To Reproduce
Steps to reproduce the behavior:
- Configure ABS
- Run ingestion
- Check failed logs
Desktop:
- OS: macOS (Docker)
- Browser: Safari
- Version: 18.5