Skip to content

feat: add create_data_element MCP tool#27

Closed
connor-savage wants to merge 3 commits intobot/mainfrom
factory/create_data_element
Closed

feat: add create_data_element MCP tool#27
connor-savage wants to merge 3 commits intobot/mainfrom
factory/create_data_element

Conversation

@connor-savage
Copy link

Summary

Add create_data_element MCP tool to Chip.

Generated by chip-factory pipeline.

Generated by chip-factory pipeline.
@connor-savage connor-savage requested a review from a team as a code owner February 27, 2026 00:15
@svc-snyk-github-jira
Copy link

svc-snyk-github-jira commented Feb 27, 2026

Snyk checks have passed. No issues have been found so far.

Status Scanner Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues
Code Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

Copy link

@EricWarnerCollibra EricWarnerCollibra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: create_data_element tool

Build: ✅ Pass | Tests: ✅ Pass (8/8) | Lint: ✅ Pass | MCP: ✅ Registered & callable

Critical Bug: Error response field mismatch

File: pkg/clients/create_data_element_client.go:42

The CreateDataElementErrorResponse struct maps Message to json:"message", but the actual Collibra REST API returns error messages in the userMessage field. This was confirmed by testing against the live API:

// Actual API response for duplicate name (400):
{"statusCode":400,"titleMessage":"Value not allowed","userMessage":"An asset with signifier 'X' already exists...","errorCode":"termAlreadyExists"}

// Actual API response for bad request (400):
{"statusCode":400,"titleMessage":"Bad Request","userMessage":"There was an error while processing JSON content"}

// Actual API response for forbidden (403):
{"statusCode":403,"titleMessage":"Not authorized","userMessage":"You are not allowed to perform this action...","errorCode":"resourceNoPermission"}

Because the struct looks for message (which doesn't exist in the response), errResp.Message is always empty, and the code falls through to the generic "unexpected status: %d" message. This means all API errors lose their descriptive messages.

Fix: Change json:"message" to json:"userMessage" in CreateDataElementErrorResponse.Message, and update the unit tests to use userMessage in their mock responses.

Convention Violation: JSON field naming uses snake_case instead of camelCase

File: pkg/tools/create_data_element/tool.go:17-20,26

All other tools in the codebase use camelCase for JSON field names in Input/Output structs:

  • get_asset_details: json:"assetId"
  • add_data_classification_match: json:"assetId", json:"classificationId"
  • keyword_search: json:"resourceTypeFilters"
  • pull_data_contract_manifest: json:"dataContractId"

But create_data_element uses snake_case: json:"domain_id", json:"display_name", json:"status_id", json:"resource_type".

Fix: Change to camelCase (domainId, displayName, statusId, resourceType) for consistency.

Improvement: Whitespace-only name validation

File: pkg/tools/create_data_element/tool.go:41

The tool validates input.Name == "" but doesn't handle whitespace-only names (e.g., " "). Testing against the live API shows that whitespace-only names cause a 500 server error. Adding strings.TrimSpace before the empty check would prevent this and provide a clear client-side error message.

Adversarial Test Results

Test Result
Empty name ("") ✅ Returns "name is required"
Missing name field ✅ Schema validation rejects
Missing domain_id ✅ Schema validation rejects
Malformed UUID domain_id ⚠️ Returns "unexpected status: 400" (should include API error message)
Duplicate name ⚠️ Returns "unexpected status: 400" (should include API error message)
Whitespace-only name (" ") ❌ Returns "unexpected status: 500" — API crashes
Unicode/emoji name ✅ Created successfully
HTML injection in name ✅ Created successfully (API accepts)
Invalid status_id ⚠️ Returns "unexpected status: 400" (should include API error message)
Optional fields included ✅ Created successfully with displayName and statusId

Summary

The tool structure is clean and follows most Chip conventions correctly. However, the error response field mismatch is a real bug that makes all API error messages invisible to the user. This must be fixed before merging.

- Fix error response field mismatch: change json:"message" to
  json:"userMessage" to match actual Collibra REST API error format,
  making API error messages visible to users
- Fix JSON field naming: change snake_case to camelCase in Input/Output
  structs (domainId, displayName, statusId, resourceType) for
  consistency with other Chip tools
- Add whitespace-only name validation using strings.TrimSpace to prevent
  server 500 errors from whitespace-only names
- Add WhitespaceOnlyName test case

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@EricWarnerCollibra EricWarnerCollibra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: create_data_element Tool

Convention Compliance ✅

The tool follows all Chip tool package conventions:

  • Package name matches directory (create_data_element)
  • Exports NewTool, Input, Output correctly
  • NewTool takes *http.Client, returns *chip.Tool[Input, Output]
  • handler is unexported, returns chip.ToolHandlerFunc[Input, Output]
  • Proper json and jsonschema tags on Input/Output
  • Optional fields use omitempty
  • Tool name is snake_case
  • Errors returned as Output{}, err — no panics
  • No direct mcp-go imports; uses pkg/chip and pkg/clients
  • Permissions includes "dgc.ai-copilot"

Build / Tests / Lint ✅

  • go build ./cmd/chip — passes
  • go test ./pkg/tools/create_data_element/... — all 9 tests pass
  • go test ./... — all packages pass
  • golangci-lint run ./... — clean

MCP Validation ✅

  • Tool registers correctly in factory mcp-list
  • Happy path creates an asset and returns {"id": "...", "resourceType": "Asset"}

Adversarial Testing ✅ (8 edge cases tested)

Test Case Result
Empty name "" "name is required"
Whitespace-only name " " "name is required" (TrimSpace working)
Missing domainId field ✅ Schema validation: missing properties: ["domainId"]
Missing name field ✅ Schema validation: missing properties: ["name"]
Invalid domain UUID "not-a-valid-uuid" ⚠️ Generic error: "There was an error while processing JSON content"
Duplicate name ✅ Clear error: "An asset with signifier '...' already exists"
Special characters / Unicode / HTML ✅ Creates successfully, no panic
Non-existent domain UUID ✅ Clear error: "domain(s) was/were not found"
Invalid statusId ⚠️ Generic error: "There was an error while processing JSON content"
Empty object {} ✅ Schema validation: missing properties: ["name" "domainId"]

Requested Change: Add UUID format validation for domainId and statusId

File: pkg/tools/create_data_element/tool.go, lines 46-48

The tool validates name (non-empty, trims whitespace) and domainId (non-empty), but does not validate that domainId or statusId are valid UUID format. When an invalid UUID like "not-a-valid-uuid" is passed, the Collibra API returns a generic, unhelpful error: "There was an error while processing JSON content".

Other tools in this repo already establish the pattern of validating UUID inputs locally:

  • get_asset_details/tool.go line 39: uuid.Parse(input.AssetID)
  • pull_data_contract_manifest/tool.go line 34: uuid.Parse(input.DataContractID)

Please add uuid.Parse() validation for domainId (always) and statusId (when provided) to return clear, actionable error messages like "domainId is not a valid UUID: not-a-valid-uuid".

This gives the LLM caller immediate feedback to correct the input, instead of a cryptic API error that requires guessing what went wrong.

Validates domainId and statusId as proper UUIDs before calling the API,
giving the LLM caller immediate actionable feedback instead of a cryptic
API error. Follows the existing uuid.Parse pattern from get_asset_details
and pull_data_contract_manifest tools.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@EricWarnerCollibra EricWarnerCollibra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: create_data_element Tool

🔴 Critical Bug: Wrong Asset Type ID

The DataElementTypeID constant in pkg/tools/create_data_element/tool.go line 15 is incorrect:

const DataElementTypeID = "00000000-0000-0000-0000-000000031302"

This UUID maps to the "System" asset type, NOT "Data Element". I verified this directly against the Collibra REST API:

GET /rest/2.0/assetTypes/00000000-0000-0000-0000-000000031302
→ name: "System"

The correct UUID for the "Data Element" asset type is:

00000000-0000-0000-0000-000000031026

Verified via:

GET /rest/2.0/assetTypes?name=Data+Element&nameMatchMode=EXACT
→ id: "00000000-0000-0000-0000-000000031026", name: "Data Element"

This means every asset created by this tool is the wrong type. The constant should be:

const DataElementTypeID = "00000000-0000-0000-0000-000000031026"

The unit tests pass but they don't catch this because they mock the API and only verify the type ID matches the constant — they don't validate the constant itself is correct.

🟡 Suggestion: Add Name Length Validation

The Collibra API enforces a name length limit of 1-2000 characters (verified via adversarial testing with a 5000-char name). Consider adding local validation for this in the handler to provide a clearer, faster error message rather than relying on the API:

if len(input.Name) > 2000 {
    return Output{}, fmt.Errorf("name must not exceed 2000 characters (got %d)", len(input.Name))
}

✅ Convention Compliance (All Passing)

  • Package name matches directory name (create_data_element)
  • Exports exactly: NewTool, Input, Output
  • NewTool takes *http.Client, returns *chip.Tool[Input, Output]
  • handler is unexported, returns chip.ToolHandlerFunc[Input, Output]
  • Input/Output use json and jsonschema tags
  • Optional fields use json:"field,omitempty"
  • Tool name is snake_case (create_data_element)
  • Errors returned as Output{}, err — no panics
  • No direct mcp-go imports
  • Client calls go through pkg/clients/
  • Permissions includes "dgc.ai-copilot"

✅ Build / Test / Lint

  • Build: ✅ go build ./cmd/chip passes
  • Tests: ✅ All 11 unit tests pass (go test ./pkg/tools/create_data_element/...)
  • Full suite: ✅ go test ./... passes (all packages)
  • Lint: ✅ golangci-lint run ./... clean

✅ MCP Validation

  • Tool is registered and listed via factory mcp-list
  • Happy path works: creates asset, returns {id, resourceType}
  • Error responses are properly surfaced with isError: true

✅ Adversarial Testing Results

Test Case Result
Empty name "" "name is required"
Whitespace-only name " " "name is required" (trimmed)
Missing domainId field ✅ Schema validation rejects
Missing name field ✅ Schema validation rejects
Invalid domainId (not UUID) "domainId is not a valid UUID: ..."
Invalid statusId (not UUID) "statusId is not a valid UUID: ..."
Non-existent domain UUID "API error (status 404): ..."
Duplicate name in domain "API error (status 400): ..."
Unicode/special chars in name ✅ Creates successfully
Very long name (5000 chars) ✅ API rejects with clear message
Optional displayName ✅ Creates successfully
Forbidden domain "API error (status 403): ..."

Summary

Good implementation overall — clean code, solid conventions, comprehensive tests, and proper error handling. However, the wrong asset type ID is a critical bug that must be fixed before merging. The tool currently creates "System" assets instead of "Data Element" assets.

@connor-savage connor-savage deleted the factory/create_data_element branch February 27, 2026 03:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants