Implement new FFI system as in the proposal #583

Merged
davazp merged 55 commits into master from ffi-improvements on Feb 4, 2026

@davazp (Member) commented Jan 31, 2026

FFI Proposal

Done

  • #j"foo" reader macro for JS string literals
  • (jsstring x) builtin for converting Lisp strings to JS strings
  • Printer support: JS strings print as #j"...", raw with princ
  • Printer support for +true+, +false+, +null+, +undefined+
  • Rename new → object, make-new → new, remove object!
  • Remove lisp_to_js conversion from object
  • Ensure Lisp strings work for property access in oget/oset
  • Fix disassemble to work with lambda forms
  • Multiple values side-channel (internals._mv instead of values argument)
  • unwind-protect save/restore of _mv
  • Comprehensive multiple-values test suite
  • (clstring x) convert JS string to CL string
  • (jsbool x) convert t/nil to +true+/+false+
  • (clbool x) convert JS booleans to t/nil
  • Remove redundant oget*/oset* variants
  • Remove js_to_lisp conversion from oget (make it behave like oget*)
  • Remove lisp_to_js conversion from oset (make it behave like oset*)
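The unwind-protect item in the list above matters because cleanup forms can themselves call functions that overwrite the side channel. A minimal JavaScript sketch of the save/restore idea follows. It assumes a hypothetical `internals` object and hypothetical helpers `values` and `unwindProtect`; it is an illustration, not the code JSCL actually generates.

```javascript
// Hypothetical stand-in for the runtime's internals object.
const internals = { _mv: null };

// Hypothetical helper: publish all values on the side channel and
// return the primary value like an ordinary JS function would.
function values(...vals) {
  internals._mv = vals;
  return vals[0];
}

// Sketch of unwind-protect: run the protected form, then the cleanup,
// and restore _mv so the cleanup cannot clobber the protected values.
function unwindProtect(protectedForm, cleanupForm) {
  let saved = null;
  try {
    const primary = protectedForm();
    saved = internals._mv; // save the values of the protected form
    return primary;
  } finally {
    cleanupForm();         // may overwrite internals._mv
    internals._mv = saved; // restore (null if the form did not finish)
  }
}

const primary = unwindProtect(
  () => values(1, 2),
  () => values("cleanup", "noise")
);
```

After this runs, `primary` is the first value of the protected form and `internals._mv` still holds both of its values, even though the cleanup form wrote to the side channel.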

Not included:

  • LispString representation for CL strings (wrapping JS strings instead of arrays)

Previously, CL functions took a values argument:

function (values, ...args) {
}

Now CL functions take a general JS signature

function (...args) {
   ...
}

but can write multiple values to an internals._mv variable. The
compiler has been adjusted to carefully set and reset this side
channel variable in multiple places.

_mv contains all values, not just the secondary ones.

That way we can differentiate between (values) and (values nil).

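The side channel described above can be sketched in plain JavaScript. The names `internals`, `values`, `multipleValueList`, and the `NIL` placeholder object are assumptions made for illustration and do not claim to match JSCL's runtime. The point is that storing all values, including the primary one, lets a caller tell an empty result from a single NIL.

```javascript
// Hypothetical stand-ins, for illustration only.
const internals = { _mv: null };
const NIL = { name: "NIL" }; // placeholder object, not JSCL's real NIL

// A function with a general JS signature that also reports every
// value on the side channel.
function values(...vals) {
  internals._mv = vals;
  return vals.length > 0 ? vals[0] : NIL;
}

// Collect all values of a call: use _mv when the callee set it,
// otherwise fall back to the single primary return value.
function multipleValueList(thunk) {
  internals._mv = null;
  const primary = thunk();
  const mv = internals._mv;
  internals._mv = null;
  return mv !== null ? mv : [primary];
}

const none = multipleValueList(() => values());      // like (values)
const oneNil = multipleValueList(() => values(NIL)); // like (values nil)
const plain = multipleValueList(() => 42);           // ordinary JS function
```

Here `none` is an empty array while `oneNil` has exactly one element, which would be indistinguishable if only the secondary values were stored.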
The _mv side channel is cleared too early in compile-funcall. When a
function call is compiled with *multiple-value-p* = t, the compiler
generates:

  (internals._mv = null, func.fvalue(arg1, arg2, ...))

The _mv = null clearing happens before argument evaluation (due to JS
comma operator semantics - the left side executes first, then the
arguments are evaluated as part of the function call expression on the
right). But if any argument contains a sub-expression that calls a
function returning multiple values (like intern, which returns (values
symbol was-present-p)), that sub-expression will set _mv to a non-null
value after the clearing but before the outer function call executes.

When the outer multiple-value-call then checks _mv, it finds the stale
values from the inner intern call instead of the actual multiple
values from the outer call. This causes it to use the wrong values.
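The stale-values problem described above can be reproduced in a few lines of plain JavaScript. This is a minimal sketch rather than JSCL's generated code: `internals`, `values`, `intern`, and `list` are hypothetical stand-ins, and `false` stands in for NIL.

```javascript
// Hypothetical stand-ins, for illustration only.
const internals = { _mv: null };

function values(...vals) {
  internals._mv = vals;
  return vals[0];
}

// Stand-in for intern, which returns two values.
function intern(name) {
  return values(name, false);
}

// Stand-in for an outer function that returns a single value and
// never touches the side channel.
function list(x) {
  return [x];
}

// Same shape as the generated call: _mv is cleared first, then the
// argument is evaluated, then the outer function runs.
const primary = (internals._mv = null, list(intern("MAKE-ES")));

// What a multiple-value-call style consumer would observe.
const seen = internals._mv !== null ? internals._mv : [primary];
```

The consumer sees the two values left behind by the inner `intern` call instead of the single value produced by `list`.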

In the defstruct case:

1. (setf (dsd-constructors dd) (or constructors (list (make-dsd-constructor :name (%symbolize "MAKE-" (dsd-name dd)) :boa :default))))
2. The setf is wrapped in a multiple-value-call by the compiler
3. Inside the argument evaluation, %symbolize calls intern which sets _mv = [MAKE-ES, NIL]
4. The outer multiple-value-call picks up this stale _mv value
5. Instead of passing the correct list (#S(dsd-constructor ...)) to the setf function, it passes MAKE-ES
6. So (dsd-constructors dd) becomes the symbol MAKE-ES instead of a list of constructor structs
7. Later, mapcar over this "list" tries caar on MAKE-ES, producing the TYPE-ERROR

The fix would need to ensure _mv is cleared after all arguments are
evaluated but before the function body begins executing, or the code
generation needs to evaluate arguments into temporaries first and
clear _mv after.
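The second option named above, evaluating arguments into temporaries and clearing _mv afterwards, might look like the following in plain JavaScript. It is a sketch with hypothetical stand-ins (`internals`, `values`, `intern`, `list`, and the temporary `t0`), and it is not necessarily the fix that was merged.

```javascript
// Hypothetical stand-ins, for illustration only.
const internals = { _mv: null };

function values(...vals) {
  internals._mv = vals;
  return vals[0];
}

// Stand-in for intern, which returns two values (false stands in for NIL).
function intern(name) {
  return values(name, false);
}

// Outer function that returns a single value and leaves _mv alone.
function list(x) {
  return [x];
}

// Evaluate the argument into a temporary first, clear the side
// channel, and only then perform the outer call.
let t0;
const primary = (t0 = intern("MAKE-ES"), internals._mv = null, list(t0));

// A multiple-value-call style consumer now falls back to the primary value.
const seen = internals._mv !== null ? internals._mv : [primary];
```

With this ordering, values written by argument sub-expressions are discarded before the outer call, so only the outer function can populate `_mv`.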
@github-actions

Deploy preview ready!
https://deploy-preview-583--jscl-preview.netlify.app

@davazp added this to the FFI System milestone on Jan 31, 2026
Initialize the symbols earlier as uninterned symbols, and intern them
into the CL package later.
@kchanqvq (Member) commented Feb 1, 2026

I don't think +true+ etc should be printed as +true+. This will print both '+true+ and +true+ as +true+, but the former is a Lisp symbol, the latter is a JS primitive value.

@davazp (Member, Author) commented Feb 1, 2026

> I don't think +true+ etc should be printed as +true+. This will print both '+true+ and +true+ as +true+, but the former is a Lisp symbol, the latter is a JS primitive value.

Good point. Hm, any other ideas of how we could make it readable? Read/print it as #j+true+ or similar perhaps?

It would be very convenient if any JS object were printed in a way that is readable too.

@davazp (Member, Author) commented Feb 1, 2026

The ergonomics are not great yet:

["foo", "bar"].map(x => x.toUpperCase())

can be done like

  (let ((arr (vector #j"foo" #j"bar")))
    ((oget arr "map")
     (lambda (x) ((oget x "toUpperCase")))))

A syntax for calling methods would be nice

  (let ((arr (vector #j"foo" #j"bar")))
    (jsmethod arr "map"
      (lambda (x) (jsmethod x "toUpperCase"))))
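On the JavaScript side, the proposed jsmethod form would amount to looking up a property and calling it with the receiver as `this`. The helper below is a hypothetical sketch of that behaviour; `jsmethod` does not exist in JSCL as far as this thread shows, and the error message is an illustrative choice.

```javascript
// Hypothetical helper mirroring the proposed (jsmethod obj "name" args...).
function jsmethod(obj, name, ...args) {
  const fn = obj[name];
  if (typeof fn !== "function") {
    throw new TypeError(`${String(name)} is not a method of the receiver`);
  }
  // Reflect.apply keeps obj as `this`, which a detached call would lose.
  return Reflect.apply(fn, obj, args);
}

// Equivalent of ["foo", "bar"].map(x => x.toUpperCase())
const upper = jsmethod(["foo", "bar"], "map", (x) => jsmethod(x, "toUpperCase"));
```

Binding the receiver explicitly is the main design point: extracting `arr.map` into a variable and calling it bare would fail in JavaScript because `this` would be undefined.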

It would be even nicer if #j allowed for a dynamic root more easily; then

  (let ((arr (vector #j"foo" #j"bar")))
    (#j:(arr):map (lambda (x) (#j:(x):toUpperCase))))

@kchanqvq (Member) commented Feb 1, 2026

I think the current oget is fine. I don't like the idea of adding too much involved or complicated custom syntax. This includes the #{...} mentioned elsewhere; I think we should leave it to users to define it if they find it useful. They would then be responsible for using something like named-readtable to manage collisions (we don't have readtables yet, but we will).

@kchanqvq (Member) commented Feb 1, 2026

> Good point. Hm, any other ideas of how we could make it readable? Read/print it as #j+true+ or similar perhaps?
>
> It would be very convenient if any JS object were printed in a way that is readable too.

A common solution found in the wild is #.+true+. Do we really want to make this readable? Note that in SBCL, for example, something like NaN is not printed readably.

@kchanqvq (Member) left a comment

Kudos for this ambitious undertaking!

Besides the inline review/discussion, two high-level things:

  1. I think a lot of complexity is added to emulate JS values in the host because we use #j:... literals during bootstrap.
    • How strongly do you want to use DEFSTRUCTs for the emulation? We could use some magic symbol markers, which would remove the need for MAKE-LOAD-FORM. But you could say this is one step backward.
    • The option I like better is to not use #j:... literals in the host at all. I think this is possible: just write (jsstring "cl-string"), and our compiler is smart enough to elide the conversion. If this works, I think it can massively simplify things. We're currently only emulating these literals as opaque data anyway, and I think it's unlikely we really need them.
  2. I think representing NIL with undefined will simplify lots of things and is more ergonomic. But this probably goes into another PR. Let's discuss!

(output-arg (cadr args))
(dolist (operand (cddr args))
  (js-format ",")
  (output-arg operand))))

I think this loop can be written more cleanly with a first-arg-p flag. Something like
(let ((first-arg-p t)) (dolist ... (when first-arg-p ...output-comma... (setq first-arg-p nil)) ...)). Then there would be no need for an flet (or for cddr etc. to peek into the later part of the list).
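The flag-based loop suggested here is a common way to emit separators without special-casing the head of the list. A small JavaScript sketch of the pattern follows; `emitArgs` and its `emit` callback are hypothetical names and are unrelated to the actual js-format and output-arg functions in the compiler.

```javascript
// Hypothetical emitter: writes args separated by commas using a
// "first" flag instead of peeling the first element off the list.
function emitArgs(args, emit) {
  let first = true;
  for (const arg of args) {
    if (first) {
      first = false; // no separator before the first argument
    } else {
      emit(",");
    }
    emit(String(arg));
  }
}

const out = [];
emitArgs(["a", "b", "c"], (s) => out.push(s));
const text = out.join("");
```

The same loop handles empty and single-element argument lists without any extra branches outside it.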

@davazp (Member, Author) commented Feb 3, 2026

> I think a lot of complexity is added to emulate JS values in the host because we use #j:... literals during bootstrap.
> How strongly do you want to use DEFSTRUCTs for the emulation? We could use some magic symbol markers, which would remove the need for MAKE-LOAD-FORM. But you could say this is one step backward.
>
> The option I like better is to not use #j:... literals in the host at all. I think this is possible: just write (jsstring "cl-string"), and our compiler is smart enough to elide the conversion. If this works, I think it can massively simplify things. We're currently only emulating these literals as opaque data anyway, and I think it's unlikely we really need them.

I had this before. But the thing is that now the reader does not return (jsstring "foo"); it returns a raw JS string directly. We'd need some extra conditionals in the reader and compiler.

We could make the reader's return logic conditional on whether it runs in the host. But even if they are only tokens now, it'd be useful to dispatch on JS types in the host. So I think we'd end up replicating defstructs on top of s-expressions.

We might also use list-based defstructs to remove the need for make-load-form? We have to be careful with consp when implementing the compatibility typeof, but that's okay I guess.

@davazp (Member, Author) commented Feb 3, 2026

> I think representing NIL with undefined will simplify lots of things and is more ergonomic. But this probably goes into another PR. Let's discuss!

Yeah. I think undefined is now slightly preferable to null and false for NIL. oget and oget? would return nil.

But undefined can't do the nice jscl_car trick we did to make (car nil) = nil!

I think we should still experiment with it, but I'd leave it for another branch.

@kchanqvq (Member) commented Feb 3, 2026

> I had this before. But the thing is that now the reader does not return (jsstring "foo"); it returns a raw JS string directly.

IIUC my proposal is different. My proposal is to never use #j"literal" in any code that is read in the host. This way, no JS value literal will ever get into the source forms processed by the host, so no emulation, dumping, etc. is required at the host stage.

I speculate this is possible, and wherever we use #j"literal" in the host right now can probably be rewritten to use (jsstring "..."), thus embedding no JS primitive value literal in the code (processed by the host).

@kchanqvq (Member) commented Feb 3, 2026

> But undefined can't do the nice jscl_car trick we did to make (car nil) = nil

We could also use ?.$$jscl_car for this. I don't know whether this will bring more overhead, or whether JS engines are specifically optimized for ?.. I'll experiment with this in another branch!
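To make the optional-chaining idea concrete, here is a small JavaScript sketch in which NIL is represented by undefined. The cons layout with `$$jscl_car` and `$$jscl_cdr` properties is assumed from the property name mentioned in the comment, and `cons`, `car`, and `cdr` are hypothetical helpers, not JSCL's runtime functions. The sketch does no type checking.

```javascript
// Assumption for this sketch: NIL is represented by undefined.
const NIL = undefined;

// Hypothetical cons cell layout, based on the property name above.
function cons(a, d) {
  return { $$jscl_car: a, $$jscl_cdr: d };
}

// Optional chaining short-circuits on undefined, so (car nil) and
// (cdr nil) evaluate to undefined, that is, to NIL.
const car = (x) => x?.$$jscl_car;
const cdr = (x) => x?.$$jscl_cdr;

const cell = cons(1, cons(2, NIL));
const second = car(cdr(cell));
const pastEnd = car(cdr(cdr(cell)));
```

Whether `?.` costs more than a plain property load on a NIL object is exactly the open question raised in the comment; this sketch only shows that the semantics line up.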

@davazp (Member, Author) commented Feb 3, 2026

> IIUC my proposal is different. My proposal is to never use #j"literal" in any code that is read in the host. This way, no JS value literal will ever get into the source forms processed by the host, so no emulation, dumping, etc. is required at the host stage.
>
> I speculate this is possible, and wherever we use #j"literal" in the host right now can probably be rewritten to use (jsstring "..."), thus embedding no JS primitive value literal in the code (processed by the host).

I see. That could work. So to make sure I understand:

  • Provide jsstring and jsbool as primitives
  • Compiler does both:
    • (eq (typeof X) (jsstring "string")) (for target mostly)
    • jsstring primitive (for host mostly)

So we do not dump structs.

I'll still have to keep the structs, and make jsstring a constructor for them. But at least we remove the make-load-form.

@davazp (Member, Author) commented Feb 3, 2026

I'm happy with the state of the PR now.

So if there are no objections, I'd leave extra improvements for future PRs.

@kchanqvq (Member) commented Feb 3, 2026

> I'll still have to keep the structs.

I think we can get rid of any emulation at the host stage, but I'll have to experiment; that's the only way to know!

Otherwise LGTM! There are some cosmetic simplifications to the compiler code that I wanted (like the loop for generating calls), but we can leave that for later.

@davazp (Member, Author) commented Feb 4, 2026

Merging! 🎉 Thank you @kchanqvq and @vlad-km for the help.

We'll continue with the next iteration, if needed, in #586.

@davazp merged commit dbbb15e into master on Feb 4, 2026
1 check passed
@davazp deleted the ffi-improvements branch on February 4, 2026 09:09
3 participants