Skip to content

WiP Redesign the encoding of Longs in JavaScript. #5205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

sjrd
Copy link
Member

@sjrd sjrd commented Jun 25, 2025

Overall, we make the flat representation of longs as a pair of (lo, hi) the representation, at the Emitter level. There are no more instances of RuntimeLong. The emitter flattens out all the Longs as follows:

  • A local variable of type long becomes two local variables of type int.
  • A field of type long becomes two fields of type int.
  • An Array[Long] is stored as an Int32Array with twice as many elements, alternating lo and hi words.
  • Method parameters of type long are expanded as two parameters of type int.

For the result of method parameters, there is a trick. That one is the most "debatable", in that I think there are contending alternatives that may be faster. When a method returns a Long, it stores the hi word in a global variable $resHi, then returns the lo word. At call site, we read back the $resHi global. We used a similar trick for non-inlined methods of RuntimeLong, with a var resultHi field of object RuntimeLong. Now this is moved to the emitter to take care of.

All the methods of RuntimeLong explicitly take "expanded" versions of their parameters: abs takes two parameters of type Int; add takes 4. Shifts take 3 parameters of type Int: the lo, the hi, and the shift amount. The result, however, is a Long.

In order to allow them to construct Long results from their words, we introduce a magic method RuntimeLong.pack(lo, hi), whose body is filled in by the Desugarer with a special Transient(PackLong(lo, hi)). It cannot be the compiler, because we cannot serialize transients. And it cannot wait for the emitter, because the optimizer definitely wants to see the PackLongs to unpack them. An alternative would be to introduce a new BinaryOp, but I think that's worse because it bakes an implementation detail of the emitter into the IR. The fact that we can even do this PR is a testament to the current abstraction level of our IR.

These changes significantly speed up even the SHA512 benchmark, even though it performs most computations on stack already anyway (the improvements must come from the arrays, in this case). I haven't measured benchmarks that extensively use Long fields yet, but I expect them to get dramatic speedups.

For the optimizer, this makes things a lot simpler. Instead of having special-cases for RuntimeLong everywhere, we basically have a unique withSplitLong method to deal with them. That method can split one PreTransform of type long into two PreTransforms of type ints, so that they can be given to the inlined methods of RuntimeLong. We introduce a new LongPairReplacement for LocalDefs that aggregate a pair of (lo, hi) LocalDefs (typically the result of a split PackLong). As a nice bonus, the IR checker now passes with RuntimeLong after the optimizer.


One big caveat for now: Closure breaks the new encoding, sometimes. I think it gets confused by the $resHi variable and evaluation order of function params. For example if we pass the result of a Long method to a Long parameter, we emit

x.foo(y.bar(1), $resHi);

During y.bar(1), the method modifies $resHi. It is then read right after to be passed as second argument to x.foo.

sjrd added 2 commits June 23, 2025 15:05
It turns out the computation we did in the non-negative case also
works for the negative case. The proof relies on elementary
properties of the two's complement representation.

I don't know how I never saw this before. To make things worse, it
seems that Kotlin and J2CL knew all along, and I never realized
when skimming through their implementations either.
Since conversions of signed longs to doubles is in fact no more
expensive than the unsigned longs, we can take shorter paths for
values that fit in the *signed* safe range.

This applies to the conversions to string (including in the
javalib) and to float.

It could also apply to signed division and remainder. However,
benchmarks suggest that doing so makes it slower. The trouble is
that we then need a signed double-to-long conversion for the
result, and that appears to be slower than performing the 3 sign
adjustments.
@sjrd sjrd force-pushed the rt-long-expanded branch 6 times, most recently from 7161430 to 2fe28e7 Compare June 28, 2025 11:09
@sjrd
Copy link
Member Author

sjrd commented Jun 28, 2025

@gzm0 This is not yet ready for a code review. However, it is a good time to get your opinion on the overall new compilation scheme and architectural changes. WDYT?

@sjrd sjrd force-pushed the rt-long-expanded branch from 2fe28e7 to f35c5dd Compare June 28, 2025 11:53
As a nice bonus, the IR checker now passes with `RuntimeLong`.
@sjrd sjrd force-pushed the rt-long-expanded branch from f35c5dd to 3e51123 Compare June 28, 2025 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant