feat(compass-collection): Output Validation Followup - Mock Data Generator LLM CLOUDP-347048 #7365

ncarbon · 2025-09-23T16:42:18Z

Description

This pull request introduces more validation and rendering improvements for handling faker method arguments in the mock data generator modal. The main changes include adding comprehensive validation logic for faker arguments and method names, updating UI components to display faker arguments, and refactoring validation logic out of the module to a shared utility.

Validation and Utility Enhancements:

Added areFakerArgsValid and improved isValidFakerMethod utility functions in utils.ts to thoroughly validate faker arguments (including nested and object types) and method names, enforcing limits on argument length, string size, and value types.
Refactored the schema validation logic in collection-tab.ts to use the new isValidFakerMethod utility, ensuring faker method and argument validation is consistent and centralized. [1] [2] [3]

Type and Interface Updates:

Extended the FakerArg type to support nested arrays, enabling more flexible representation of faker arguments.
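
For readers without the diff open, here is a minimal sketch of what a recursive argument type of this kind could look like. The exact union members are an assumption inferred from the review discussion (primitives, a `{ json: string }` wrapper for object arguments, and nested arrays); the real definition in script-generation-utils.ts may differ, and `isJsonArg` is a hypothetical helper.

```typescript
// Assumed shape, inferred from the review thread; not copied from the PR.
type FakerArg = string | number | boolean | { json: string } | FakerArg[];

// Hypothetical type guard for the JSON wrapper variant.
function isJsonArg(arg: FakerArg): arg is { json: string } {
  return (
    typeof arg === 'object' &&
    arg !== null &&
    !Array.isArray(arg) &&
    'json' in arg
  );
}

// Every element below type-checks against the recursive alias.
const example: FakerArg[] = [
  'abc',
  42,
  true,
  ['x', ['y', 'z']], // nested arrays are allowed by the self-reference
  { json: '{"min":1,"max":5}' },
];
```

Because the alias already includes `FakerArg[]`, a parameter typed `FakerArg[]` describes a list whose items may themselves be lists.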

UI and Rendering Improvements:

Updated FakerMappingSelector to render TextInput components for each faker argument, supporting various types (string, number, boolean, arrays, and objects) and displaying them in a read-only format. [1] [2] [3]
Updated FakerSchemaEditorContent to pass the active faker arguments to the mapping selector for display. [1] [2]

Testing:

Added comprehensive unit tests for areFakerArgsValid and isValidFakerMethod in utils.spec.ts, covering edge cases such as deeply nested structures, invalid argument types, and method existence.
Added an integration test to ensure that excessively large faker arguments are not displayed in the modal.

Checklist

New tests and/or benchmarks are included
Documentation is changed or added
If this change updates the UI, screenshots/videos are added and a design review is requested
I have signed the MongoDB Contributor License Agreement (https://www.mongodb.com/legal/contributor-agreement)

Motivation and Context

Bugfix
New feature
Dependency update
Misc

Open Questions

Dependents

Types of changes

Backport Needed
Patch (non-breaking change which fixes an issue)
Minor (non-breaking change which adds functionality)
Major (fix or feature that would cause existing functionality to change)

...mpass-collection/src/components/mock-data-generator-modal/mock-data-generator-modal.spec.tsx

Copilot

Pull Request Overview

This PR enhances the mock data generator modal with comprehensive validation and rendering improvements for faker method arguments. The changes focus on improving argument validation, centralizing validation logic, and enabling UI display of faker arguments.

Adds comprehensive validation for faker arguments including nested structures, type checking, and size limits
Refactors validation logic from collection-tab.ts to a shared utility module for better maintainability
Updates UI components to display faker arguments as read-only text inputs with proper type handling

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

File	Description
collection-tab.ts	Refactors validation to use centralized utility function
utils.ts	Introduces comprehensive faker argument and method validation utilities
utils.spec.ts	Adds extensive unit tests for validation functions
script-generation-utils.ts	Extends FakerArg type to support nested arrays
mock-data-generator-modal.spec.tsx	Adds integration test for oversized argument handling
faker-schema-editor-screen.tsx	Passes faker arguments to mapping selector
faker-mapping-selector.tsx	Implements UI rendering for faker arguments

packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts

codeowners-service-app · 2025-09-23T17:20:33Z

Assigned gagik for team compass-developers because mabaasit is out of office.

jcobis · 2025-09-23T19:16:38Z

packages/compass-collection/src/components/mock-data-generator-modal/utils.spec.ts

+        expect(result.fakerArgs).to.deep.equal([]);
+      });
+
+      it('returns false for valid method with invalid arguments and fallback fails', () => {


Shouldn't this be: a valid method with invalid arguments falls back to isValid: true and strips the args? I think that's what is happening, which is why this test is failing right now.

Yep, forgot to update it.

jcobis · 2025-09-23T19:17:41Z

packages/compass-collection/src/components/mock-data-generator-modal/utils.spec.ts

+        expect(result.fakerArgs).to.deep.equal([]);
+      });
+
+      it('returns false for method with invalid fakerArgs', () => {


My comment applies here too I believe. Both tests should expect isValid: true and fakerArgs: []

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts

jcobis · 2025-09-23T19:30:49Z

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts

+    } else if (typeof arg === 'boolean') {
+      // booleans are always valid, continue
+    } else if (Array.isArray(arg)) {
+      if (!areFakerArgsValid(arg)) {


Should we limit the depth of this to prevent potential infinite recursion? We could pass a depth param and limit to say, 3 levels
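
One way the depth parameter could look, as a rough sketch rather than the PR's implementation. The `FakerArg` shape, the function name `areArgsValid`, and the constant names and values are assumptions for illustration; only the idea of threading a depth counter through the recursion comes from the comment above.

```typescript
// Assumed argument shape; the real type lives in script-generation-utils.ts.
type FakerArg = string | number | boolean | { json: string } | FakerArg[];

// Illustrative limits; treat the exact values as assumptions.
const MAX_ARRAY_LENGTH = 10;
const MAX_STRING_LENGTH = 1000;
const MAX_DEPTH = 3;

function areArgsValid(args: FakerArg[], depth = 0): boolean {
  // Reject lists that are nested too deeply or are too long.
  if (depth > MAX_DEPTH || args.length > MAX_ARRAY_LENGTH) return false;
  for (const arg of args) {
    if (typeof arg === 'string') {
      if (arg.length > MAX_STRING_LENGTH) return false;
    } else if (typeof arg === 'number') {
      if (!Number.isFinite(arg)) return false;
    } else if (typeof arg === 'boolean') {
      // booleans are always valid, continue
    } else if (Array.isArray(arg)) {
      // Each level of nesting increments the depth counter.
      if (!areArgsValid(arg, depth + 1)) return false;
    } else {
      // { json } wrapper: require that the payload at least parses.
      try {
        JSON.parse(arg.json);
      } catch {
        return false;
      }
    }
  }
  return true;
}
```

With these values the top-level list sits at depth 0, so three levels of nested arrays pass and a fourth is rejected.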

kpamaran · 2025-09-23T20:54:16Z

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts

+      callable();
+      return { isValid: true, fakerArgs: [] };
+    }
+  } catch {


A warning when there's an exception or the args are invalid would be valuable for fine tuning the prompt
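
A sketch of how the fallback and the suggested warning could fit together. Everything here is hypothetical: `validateCall`, the `Warn` callback, and the stand-in `callable` are illustration names, and the real utility calls into faker rather than an injected function.

```typescript
type Arg = string | number | boolean;
type Result = { isValid: boolean; fakerArgs: Arg[] };
type Callable = (...args: Arg[]) => unknown;
type Warn = (message: string) => void;

function validateCall(
  name: string,
  callable: Callable,
  args: Arg[],
  warn: Warn
): Result {
  try {
    // First attempt: call with the arguments suggested by the LLM.
    callable(...args);
    return { isValid: true, fakerArgs: args };
  } catch (err) {
    warn(`${name}: call with args failed (${String(err)}); retrying without args`);
  }
  try {
    // Fallback: the method itself may still be usable with no arguments.
    callable();
    return { isValid: true, fakerArgs: [] };
  } catch (err) {
    warn(`${name}: call without args failed (${String(err)})`);
    return { isValid: false, fakerArgs: [] };
  }
}

// Stand-in for a faker method: accepts zero arguments or a single number.
const pickNumber: Callable = (...args) => {
  if (args.length > 0 && typeof args[0] !== 'number') {
    throw new Error('expected a number');
  }
  return 4;
};
```

Passing the logger in as a parameter keeps the sketch free of any particular logging API; each warning records which attempt failed, which is the signal that would help with prompt tuning.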

...mpass-collection/src/components/mock-data-generator-modal/mock-data-generator-modal.spec.tsx

packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx

kpamaran · 2025-09-23T21:27:59Z

The faker arg form experience itself is interesting, I wonder how well users can intuit that the args are positional. Could be a follow-up. Maybe the user can be shown an inline preview of the faker call (cc @jcobis), right now they'd need to make the connection by reading the Script output

ncarbon · 2025-09-23T21:33:17Z

The faker arg form experience itself is interesting, I wonder how well users can intuit that the args are positional. Could be a follow-up. Maybe the user can be shown an inline preview of the faker call (cc @jcobis), right now they'd need to make the connection by reading the Script output

Yeah, good question. If we can't show a specific label for each faker argument, an inline preview of the faker call would be good UI to show.

jcobis · 2025-09-24T14:38:57Z

The faker arg form experience itself is interesting, I wonder how well users can intuit that the args are positional. Could be a follow-up. Maybe the user can be shown an inline preview of the faker call (cc @jcobis), right now they'd need to make the connection by reading the Script output

Yeah, good question. If we can't show a specific label for each faker argument, an inline preview of the faker call would be good UI to show.

What if: if there are args, we just display the whole sample function call with args in a readonly text input? This is what I had in mind. Thoughts? @ncarbon @kpamaran

kpamaran · 2025-09-25T16:03:05Z

What if: if there are args, we just display the whole sample function call with args in a readonly text input? This is what I had in mind. Thoughts?

I have confidence the sample call should be displayed. I'm not sure if it should be in an input, because the preview is not part of the form data (non-semantic usage). It could be some inline text above the field set. @jcobis @ncarbon
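
To make the preview idea concrete, here is a minimal sketch of building the sample call string. The function name `formatFakerCall`, the `PreviewArg` type, and the `module.method` string format are assumptions; the sketch only covers primitives and nested arrays and leans on `JSON.stringify` for quoting.

```typescript
type PreviewArg = string | number | boolean | PreviewArg[];

// Builds a display-only string such as: faker.helpers.arrayElement(["a","b"])
function formatFakerCall(method: string, args: PreviewArg[]): string {
  // JSON.stringify quotes strings and handles nested arrays.
  const rendered = args.map((arg) => JSON.stringify(arg)).join(', ');
  return `faker.${method}(${rendered})`;
}
```

Since the result is plain text, it can be rendered as inline copy above the field set or in a read-only input without changing this function.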

ncarbon · 2025-09-25T16:06:01Z

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts

+const MAX_FAKER_ARGS_LENGTH = 10;
+const MAX_FAKER_STRING_LENGTH = 1000;
+const MAX_FAKER_ARGS_DEPTH = 3;
+const MAX_FAKER_NUMBER_SIZE = 1000;


Thoughts on these values? @jcobis

I feel like we could bump MAX_FAKER_NUMBER_SIZE to 10000 at least

kpamaran · 2025-09-26T14:59:42Z

packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx

  fontWeight: 600,
 });

+const parseFakerArg = (arg: FakerArg): string => {


nit: there's a parseFakerArgs with a different behavior. One should be renamed

I can't find another function you mentioned, but the name still doesn't make too much sense: it seems to be the opposite of parse, takes the arguments and serializes them to string

Renaming to stringifyFakerArg

gribnoysup · 2025-10-06T08:19:21Z

packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx

  fontWeight: 600,
 });

+const parseFakerArg = (arg: FakerArg): string => {


I can't find another function you mentioned, but the name still doesn't make too much sense: it seems to be the opposite of parse, takes the arguments and serializes them to string

gribnoysup · 2025-10-06T08:23:09Z

packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx

+  if (typeof arg === 'object' && arg !== null && 'json' in arg) {
+    try {
+      return JSON.stringify(JSON.parse(arg.json));
+    } catch {
+      return '';
+    }
+  }
+  return arg.toString();


Based on the type definition, doesn't look like the array case is being handled well at all. If arg is FakerArg[] (which is part of the FakerArg type, it recursively references itself), you will just call prototype toString on it. Is this a mistake in how types are defined or in the method logic?

Thank you, going to handle arrays recursively
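
A sketch of what the recursive handling might look like, assuming the `FakerArg` shape discussed in this thread. This is an illustration, not the committed fix; the bracketed output format is a choice made here for readability.

```typescript
// Assumed argument shape; the real type lives in script-generation-utils.ts.
type FakerArg = string | number | boolean | { json: string } | FakerArg[];

function stringifyFakerArg(arg: FakerArg): string {
  // Arrays are rendered element by element instead of via Array.prototype.toString.
  if (Array.isArray(arg)) {
    return `[${arg.map((item) => stringifyFakerArg(item)).join(', ')}]`;
  }
  if (typeof arg === 'object' && arg !== null && 'json' in arg) {
    try {
      // Round-trip through JSON to normalize whitespace.
      return JSON.stringify(JSON.parse(arg.json));
    } catch {
      return '';
    }
  }
  // Remaining cases are string, number, and boolean.
  return arg.toString();
}
```

Checking `Array.isArray` first matters because arrays also satisfy `typeof arg === 'object'`.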

gribnoysup · 2025-10-06T08:35:02Z

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts

+function isAllowedHelper(moduleName: string, methodName: string) {
+  return (
+    moduleName !== 'helpers' ||
+    methodName === 'arrayElement' ||
+    methodName === 'arrayElements'
+  );
+}


It's hard to understand the intent from this expression, and the function name doesn't match what's going on: we're validating all method names, not only the helpers. I would suggest restructuring it to be more explicit about what we're actually checking:

Suggested change

-function isAllowedHelper(moduleName: string, methodName: string) {
-  return (
-    moduleName !== 'helpers' ||
-    methodName === 'arrayElement' ||
-    methodName === 'arrayElements'
-  );
-}
+function isAllowedFakerFn(moduleName: string, methodName: string) {
+  if (moduleName !== 'helpers') {
+    return true // all non-helper modules are allowed unconditionally
+  }
+  return methodName === 'arrayElement' || methodName === 'arrayElements' // only array helpers are allowed
+}

Refactored for readability

gribnoysup · 2025-10-06T08:38:14Z

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts

+    if (typeof arg === 'object' && arg !== null && 'json' in arg) {
+      return JSON.parse((arg as { json: string }).json);
+    }
+    return arg;


FakerArg is recursively an array of the same type and this is not handled. Is this type actually defined correctly? There is a lot of code here now that uses FakerArg[], but the FakerArg type already includes an array, so either marking it as an array is redundant, or the type is wrong and it's always supposed to be a flat array of arguments.

The recursive type definition is to support nested arrays.
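
If nested arrays can themselves contain `{ json }` wrappers, the preparation step would need to recurse as well. The following is a hedged sketch under that assumption; the per-item helper `prepareFakerArg` is a hypothetical name, and whether the merged code recurses this way is not shown in this thread.

```typescript
// Assumed argument shape; the real type lives in script-generation-utils.ts.
type FakerArg = string | number | boolean | { json: string } | FakerArg[];

// Converts one argument into the plain value that would be passed to faker.
function prepareFakerArg(arg: FakerArg): unknown {
  if (Array.isArray(arg)) {
    return arg.map((item) => prepareFakerArg(item));
  }
  if (typeof arg === 'object' && arg !== null && 'json' in arg) {
    // Throws on malformed JSON, so arguments should be validated first.
    return JSON.parse(arg.json);
  }
  return arg;
}

function prepareFakerArgs(args: FakerArg[]): unknown[] {
  return args.map((arg) => prepareFakerArg(arg));
}
```

Splitting the per-item case from the list case makes it explicit that the top level is always a list of positional arguments, while any single argument may be a nested list.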

gribnoysup · 2025-10-06T12:13:40Z

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts

+import type { FakerArg } from './script-generation-utils';
+import { faker } from '@faker-js/faker/locale/en';
+
+const MAX_FAKER_ARGS_LENGTH = 10;


Looking through the docs I can't see a case where it's ever more than 1 argument, actually. Any reason not to limit this more aggressively?

I see some examples of 2 arguments, such as fromCharacters and arrayElements, but none more than that. Will reduce this to 2.

Ah, I see the issue, this is just an imprecise variable name. Should be MAX_ARRAY_LENGTH, will rename. I can also add validation to ensure top-level args are limited to 2 at most.
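
A small sketch of the two separate limits described here: one for the number of top-level (positional) arguments and one for the length of any array passed as an argument. The names `MAX_TOP_LEVEL_ARGS` and `withinLengthLimits` and the values are assumptions for illustration, not the merged code.

```typescript
// Names and values are assumptions based on the thread, not the merged code.
const MAX_TOP_LEVEL_ARGS = 2;
const MAX_ARRAY_LENGTH = 10;

function withinLengthLimits(args: unknown[]): boolean {
  // Positional arguments: the thread found no faker method taking more than 2.
  if (args.length > MAX_TOP_LEVEL_ARGS) return false;
  // Array-valued arguments (and arrays nested inside them) get their own cap.
  const nestedOk = (value: unknown): boolean =>
    !Array.isArray(value) ||
    (value.length <= MAX_ARRAY_LENGTH && value.every((v) => nestedOk(v)));
  return args.every((arg) => nestedOk(arg));
}
```

This sketch does not bound nesting depth; that would be a separate check.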

gribnoysup · 2025-10-06T12:18:38Z

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts

+      typeof (arg as { json?: unknown }).json === 'string'
+    ) {
+      try {
+        const parsedJson = JSON.parse((arg as { json: string }).json);


You can use an 'in' check here, which also lets TypeScript know that the property exists, so you don't need extra assertions later:

Suggested change

-      typeof (arg as { json?: unknown }).json === 'string'
-    ) {
-      try {
-        const parsedJson = JSON.parse((arg as { json: string }).json);
+      'json' in arg && typeof arg.json === 'string'
+    ) {
+      try {
+        const parsedJson = JSON.parse(arg.json);

gribnoysup · 2025-10-06T12:26:47Z

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts

+function prepareFakerArgs(args: FakerArg[]) {
+  return args.map((arg) => {
+    if (typeof arg === 'object' && arg !== null && 'json' in arg) {
+      return JSON.parse((arg as { json: string }).json);


Redundant assertion

Suggested change

-      return JSON.parse((arg as { json: string }).json);
+      return JSON.parse(arg.json);

feat(compass-collection): CLOUDP-347048 LLM Output Validation Followup - Mock Data Generator

d9f6a23

ncarbon added the "no release notes" label (Fix or feature not for release notes) Sep 23, 2025

github-actions bot added the feat label Sep 23, 2025

ncarbon commented Sep 23, 2025

...mpass-collection/src/components/mock-data-generator-modal/mock-data-generator-modal.spec.tsx (outdated, resolved)

ncarbon marked this pull request as ready for review September 23, 2025 17:15

ncarbon requested a review from a team as a code owner September 23, 2025 17:15

ncarbon requested review from mabaasit and Copilot September 23, 2025 17:15

Copilot AI reviewed Sep 23, 2025

codeowners-service-app bot requested a review from gagik September 23, 2025 17:20

call faker method without args if invalid

406aa2b

jcobis reviewed Sep 23, 2025

packages/compass-collection/src/components/mock-data-generator-modal/utils.ts (outdated, resolved)

jcobis reviewed Sep 23, 2025

kpamaran reviewed Sep 23, 2025

...mpass-collection/src/components/mock-data-generator-modal/mock-data-generator-modal.spec.tsx (resolved)

update faker argument handling and improve validation logging

0b079fb

kpamaran reviewed Sep 23, 2025

packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx (outdated, resolved)

kpamaran reviewed Sep 23, 2025

packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx (outdated, resolved)

kpamaran reviewed Sep 23, 2025

packages/compass-collection/src/components/mock-data-generator-modal/faker-mapping-selector.tsx (outdated, resolved)

resolve TypeScript error by casting initialState to any

df49911

add faker argument validation for large numbers and add sample function call display

b471586

ncarbon changed the title ~~feat(compass-collection): CLOUDP-347048 LLM Output Validation Followup - Mock Data Generator~~ feat(compass-collection): Output Validation Followup - Mock Data Generator CLOUDP-347048 LLM Sep 25, 2025

ncarbon commented Sep 25, 2025

jcobis approved these changes Sep 25, 2025

increase max number size, allow faker.helpers.arrayElements, parse args before call, update ui label

06258a8

ncarbon changed the title ~~feat(compass-collection): Output Validation Followup - Mock Data Generator CLOUDP-347048 LLM~~ feat(compass-collection): Output Validation Followup - Mock Data Generator LLM CLOUDP-347048 Sep 25, 2025

ncarbon requested a review from kpamaran September 25, 2025 20:57

add/fix faker call preview test

84ee0cf

kpamaran reviewed Sep 26, 2025

kpamaran approved these changes Sep 26, 2025

replace parseFakerArgs with prepareFakerArgs for better argument handling

a8b814b

ncarbon changed the title ~~feat(compass-collection): Output Validation Followup - Mock Data Generator LLM CLOUDP-347048~~ feat(compass-collection): Output Validation Followup - Mock Data Generator LLM CLOUDP-347048 Sep 26, 2025

ncarbon requested a review from gribnoysup September 29, 2025 16:17

gribnoysup reviewed Oct 6, 2025

jcobis added 4 commits October 6, 2025 12:14

Address comment

e9087d4

Refactor util

806f5c3

Address comment

e2dc601

Comments

4f65350

jcobis requested a review from gribnoysup October 6, 2025 16:55

Merge branch 'main' into CLOUDP-347048/faker-args-validation

340234a
