Closed
52 commits
d4136b5
Cache intermediate segments allocated during FFM stub invocations.
mernst-github Jan 15, 2025
93280da
readability
mernst-github Jan 16, 2025
6e6bcfb
Merge branch 'master' into mernst/cache-segments
mernst-github Jan 16, 2025
e27798d
merge master
mernst-github Jan 16, 2025
f96d963
readability
mernst-github Jan 16, 2025
2bda29a
avoid TL lookup if not necessary
mernst-github Jan 16, 2025
61f35c9
!!@# format
mernst-github Jan 16, 2025
9cf9837
final
mernst-github Jan 16, 2025
2964f84
final
mernst-github Jan 16, 2025
f2cd144
add pinned sections around CTL manipulation
mernst-github Jan 16, 2025
a0ac383
feedback:
mernst-github Jan 16, 2025
0a41dce
minimal continuation pinning
mernst-github Jan 17, 2025
68d4bcc
cache the bounded area/slicing allocator
mernst-github Jan 17, 2025
b35cc86
confine buffers.
mernst-github Jan 17, 2025
fd9e791
Merge branch 'master' into mernst/cache-segments
mernst-github Jan 17, 2025
c0b2beb
no need to use SlicingAllocator directly
mernst-github Jan 17, 2025
edaa0a0
Merge remote-tracking branch 'origin/mernst/cache-segments' into mern…
mernst-github Jan 17, 2025
021d037
revert SlicingAllocator
mernst-github Jan 17, 2025
634b909
reorder
mernst-github Jan 17, 2025
195f68a
move scoping
mernst-github Jan 17, 2025
1f2110a
move pinned cache lookup out of constructor.
mernst-github Jan 17, 2025
09e9c9d
Benchmark:
mernst-github Jan 18, 2025
46bf342
copyright header
mernst-github Jan 18, 2025
4940f39
Add comparison benchmark for out-parameter.
mernst-github Jan 18, 2025
5b750a3
shave off a couple more nanos
mernst-github Jan 19, 2025
001c785
move CallBufferCache out
mernst-github Jan 19, 2025
d9a49c6
unit test
mernst-github Jan 19, 2025
873ffa6
(c)
mernst-github Jan 19, 2025
343909b
Storing segment addresses instead of objects in the cache appears to …
mernst-github Jan 19, 2025
4a2210d
tiny stylistic changes
mernst-github Jan 19, 2025
35a3a15
Implementation notes.
mernst-github Jan 20, 2025
b7be3a6
revert formatting
mernst-github Jan 20, 2025
643efd7
move bench
mernst-github Jan 20, 2025
4f8a9a9
shift api boundary
mernst-github Jan 20, 2025
0023eb4
reduce visibility
mernst-github Jan 20, 2025
f68a930
remove stray -Xlog:gc
mernst-github Jan 20, 2025
a523278
whitespace :scream:
mernst-github Jan 20, 2025
5a8491f
restore 3 forks
mernst-github Jan 20, 2025
d408852
Back buffer allocation with a single carrier-local segment.
mernst-github Jan 22, 2025
ad0b928
--unnecessary annotations
mernst-github Jan 22, 2025
d347a87
* use slicing allocator for alignment guarantees
mernst-github Jan 22, 2025
b0c2af1
(c)
mernst-github Jan 22, 2025
954a685
more test
mernst-github Jan 22, 2025
686132b
an attempt at a stress test
mernst-github Jan 22, 2025
13dfec9
(c)
mernst-github Jan 22, 2025
93beb68
(c)
mernst-github Jan 23, 2025
f09a29d
Apply suggestions from code review
mernst-github Jan 23, 2025
6dbda1c
topOfStack
mernst-github Jan 23, 2025
4664d36
Merge remote-tracking branch 'origin/mernst/cache-segments' into mern…
mernst-github Jan 23, 2025
0e6d532
test deep linker stack
mernst-github Jan 23, 2025
8947964
/othervm --enable-native-access=ALL-UNNAMED
mernst-github Jan 23, 2025
c314d6a
fix test under VThread factory
mernst-github Jan 23, 2025
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2021, 2023, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 2021, 2025, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -38,6 +38,22 @@ public SlicingAllocator(MemorySegment segment) {
         this.segment = segment;
     }
 
+    public long currentOffset() {
+        return sp;
+    }
+
+    public void resetTo(long offset) {
+        if (offset < 0 || offset > sp)
+            throw new IllegalArgumentException(String.format("offset %d should be in [0, %d] ", offset, sp));
+        this.sp = offset;
+    }
+
+    public boolean canAllocate(long byteSize, long byteAlignment) {
+        long min = segment.address();
+        long start = Utils.alignUp(min + sp, byteAlignment) - min;
+        return start + byteSize <= segment.byteSize();
+    }
+
     MemorySegment trySlice(long byteSize, long byteAlignment) {
         long min = segment.address();
         long start = Utils.alignUp(min + sp, byteAlignment) - min;
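The `canAllocate` check in this hunk rounds the absolute address `min + sp` up to the requested alignment and then converts the result back to a segment-relative offset. A minimal sketch of that arithmetic on plain `long` values follows. The class name `AlignmentSketch` and the explicit `base` and `capacity` parameters are hypothetical, and `alignUp` is an assumption about what `Utils.alignUp` computes (rounding up to a multiple of a power-of-two alignment), not a copy of the JDK helper.

```java
/** Hypothetical stand-alone model of the canAllocate arithmetic; not JDK code. */
final class AlignmentSketch {
    /** Rounds n up to the next multiple of alignment (assumed to be a power of two). */
    static long alignUp(long n, long alignment) {
        return (n + alignment - 1) & -alignment;
    }

    /**
     * base:     absolute address of the backing segment (segment.address() in the diff)
     * sp:       current offset within the segment
     * capacity: size of the backing segment in bytes
     */
    static boolean canAllocate(long base, long sp, long capacity, long byteSize, long byteAlignment) {
        // Align the absolute address, then convert back to a segment-relative offset.
        long start = alignUp(base + sp, byteAlignment) - base;
        return start + byteSize <= capacity;
    }
}
```

With `base = 0x1004` and `sp = 3`, an 8-byte-aligned request starts at offset 4 (address `0x1008`), so an 8-byte allocation fits in a 12-byte segment but not in an 11-byte one.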
122 changes: 122 additions & 0 deletions src/java.base/share/classes/jdk/internal/foreign/abi/BufferStack.java
@@ -0,0 +1,122 @@
/*
* Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation. Oracle designates this
* particular file as subject to the "Classpath" exception as provided
* by Oracle in the LICENSE file that accompanied this code.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*/
package jdk.internal.foreign.abi;

import jdk.internal.foreign.SlicingAllocator;
import jdk.internal.misc.CarrierThreadLocal;
import jdk.internal.vm.annotation.ForceInline;

import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;
import java.util.concurrent.locks.ReentrantLock;

public class BufferStack {
    private final long size;

    public BufferStack(long size) {
        this.size = size;
    }

    private final ThreadLocal<PerThread> tl = new CarrierThreadLocal<>() {
        @Override
        protected PerThread initialValue() {
            return new PerThread(size);
        }
    };

    @ForceInline
    public Arena pushFrame(long size, long byteAlignment) {
        return tl.get().pushFrame(size, byteAlignment);
    }

    private static final class PerThread {
        private final ReentrantLock lock = new ReentrantLock();
        private final SlicingAllocator stack;

        public PerThread(long size) {
            this.stack = new SlicingAllocator(Arena.ofAuto().allocate(size));
        }

        @ForceInline
        public Arena pushFrame(long size, long byteAlignment) {
            boolean needsLock = Thread.currentThread().isVirtual() && !lock.isHeldByCurrentThread();
Contributor

@minborg please check this -- you have discovered some cases where isVirtual is not enough (e.g. because virtual threads use carrier in the common pool, which can also be used for non-virtual thread stuff)

            if (needsLock && !lock.tryLock()) {
                // Rare: another virtual thread on the same carrier competed for acquisition.
                return Arena.ofConfined();
            }
            if (!stack.canAllocate(size, byteAlignment)) {
                if (needsLock) lock.unlock();
                return Arena.ofConfined();
            }

            return new Frame(needsLock, size, byteAlignment);
        }

        private class Frame implements Arena {
            private final boolean locked;
            private final long parentOffset;
            private final long topOfStack;
            private final Arena scope = Arena.ofConfined();
            private final SegmentAllocator frame;

            @SuppressWarnings("restricted")
            public Frame(boolean locked, long byteSize, long byteAlignment) {
                this.locked = locked;

                parentOffset = stack.currentOffset();
                MemorySegment frameSegment = stack.allocate(byteSize, byteAlignment);
                topOfStack = stack.currentOffset();
                frame = new SlicingAllocator(frameSegment.reinterpret(scope, null));
            }

            private void assertOrder() {
                if (topOfStack != stack.currentOffset())
                    throw new IllegalStateException("Out of order access: frame not top-of-stack");
            }

            @Override
            @SuppressWarnings("restricted")
            public MemorySegment allocate(long byteSize, long byteAlignment) {
                return frame.allocate(byteSize, byteAlignment);
Contributor

Should this also check order?

Contributor Author

We could, in the sense that an allocation in a lower stack frame seems suspicious, but technically it is completely legal. The frame has been allocated and is sliced to the requested size, and is guaranteed as long as the Frame's arena hasn't been closed, no matter whether other frames are on top:

frame1 = pushFrame(256);
frame2 = pushFrame(256);
<<frame1 can safely allocate up to 256>>

Contributor

Ok, in this particular design it's fine because you allocate up front, so each frame allocates in its own space.

            }

            @Override
            public MemorySegment.Scope scope() {
                return scope.scope();
            }

            @Override
            public void close() {
                assertOrder();
                scope.close();
                stack.resetTo(parentOffset);
                if (locked) {
                    lock.unlock();
                }
            }
        }
    }
}
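The review thread on `Frame.allocate` turns on the fact that each frame reserves its full size when it is pushed, so a lower frame may keep allocating inside its own reservation while other frames sit on top; only `close()` has to happen in stack order. The following is a toy model of that discipline using plain offsets instead of memory segments. `ToyBufferStack` and its members are hypothetical names, and alignment, locking and the confined-arena fallback are deliberately left out (a full stack is modelled by returning `null`).

```java
/** Toy model of the frame discipline; offsets stand in for memory segments. Not JDK code. */
final class ToyBufferStack {
    private final long capacity;
    private long sp; // current top of the stack

    ToyBufferStack(long capacity) {
        this.capacity = capacity;
    }

    long currentOffset() {
        return sp;
    }

    /** Reserves byteSize bytes up front; returns null when the stack cannot hold the frame. */
    Frame pushFrame(long byteSize) {
        if (sp + byteSize > capacity) {
            return null; // the PR's code falls back to Arena.ofConfined() here
        }
        return new Frame(byteSize);
    }

    final class Frame implements AutoCloseable {
        private final long parentOffset; // stack top before this frame was pushed
        private final long topOfStack;   // stack top right after this frame was pushed
        private long next;               // next free offset inside this frame

        Frame(long byteSize) {
            parentOffset = sp;
            next = sp;
            sp += byteSize;
            topOfStack = sp;
        }

        /** Bump-allocates inside this frame's own reservation; no ordering check is needed. */
        long allocate(long byteSize) {
            if (next + byteSize > topOfStack) {
                throw new IndexOutOfBoundsException("frame exhausted");
            }
            long offset = next;
            next += byteSize;
            return offset;
        }

        @Override
        public void close() {
            // Closing is only legal for the frame currently on top of the stack.
            if (topOfStack != sp) {
                throw new IllegalStateException("Out of order access: frame not top-of-stack");
            }
            sp = parentOffset;
        }
    }
}
```

In this model, pushing two 256-byte frames onto a 512-byte stack still lets the first frame allocate its full 256 bytes, while closing the first frame before the second throws `IllegalStateException`.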
@@ -382,26 +382,12 @@ static long pickChunkOffset(long chunkOffset, long byteWidth, int chunkWidth) {
             : chunkOffset;
     }
 
-    public static Arena newBoundedArena(long size) {
-        return new Arena() {
-            final Arena arena = Arena.ofConfined();
-            final SegmentAllocator slicingAllocator = SegmentAllocator.slicingAllocator(arena.allocate(size));
-
-            @Override
-            public Scope scope() {
-                return arena.scope();
-            }
+    private static final int LINKER_STACK_SIZE = Integer.getInteger("jdk.internal.foreign.LINKER_STACK_SIZE", 256);
+    private static final BufferStack LINKER_STACK = new BufferStack(LINKER_STACK_SIZE);
 
-            @Override
-            public void close() {
-                arena.close();
-            }
-
-            @Override
-            public MemorySegment allocate(long byteSize, long byteAlignment) {
-                return slicingAllocator.allocate(byteSize, byteAlignment);
-            }
-        };
+    @ForceInline
+    public static Arena newBoundedArena(long size) {
+        return LINKER_STACK.pushFrame(size, 8);
     }
 
     public static Arena newEmptyArena() {
157 changes: 157 additions & 0 deletions test/jdk/java/foreign/TestBufferStack.java
@@ -0,0 +1,157 @@
/*
* Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*/

/*
* @test
* @modules java.base/jdk.internal.foreign.abi
* @build NativeTestHelper TestBufferStack
* @run testng/othervm --enable-native-access=ALL-UNNAMED TestBufferStack
*/

import jdk.internal.foreign.abi.BufferStack;
import org.testng.Assert;
import org.testng.annotations.Test;

import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;
import java.lang.invoke.MethodHandle;
import java.time.Duration;
import java.util.Arrays;
import java.util.stream.IntStream;

import static java.lang.foreign.MemoryLayout.structLayout;
import static java.lang.foreign.ValueLayout.*;
import static java.time.temporal.ChronoUnit.SECONDS;

public class TestBufferStack extends NativeTestHelper {
    @Test
    public void testScopedAllocation() {
        int stackSize = 128;
        BufferStack stack = new BufferStack(stackSize);
        MemorySegment stackSegment;
        try (Arena frame1 = stack.pushFrame(3 * JAVA_INT.byteSize(), JAVA_INT.byteAlignment())) {
            // Segments have expected sizes and are accessible and allocated consecutively in the same scope.
            MemorySegment segment11 = frame1.allocate(JAVA_INT);
            Assert.assertEquals(segment11.scope(), frame1.scope());
            Assert.assertEquals(segment11.byteSize(), JAVA_INT.byteSize());
            segment11.set(JAVA_INT, 0, 1);
            stackSegment = segment11.reinterpret(stackSize);

            MemorySegment segment12 = frame1.allocate(JAVA_INT);
            Assert.assertEquals(segment12.address(), segment11.address() + JAVA_INT.byteSize());
            Assert.assertEquals(segment12.byteSize(), JAVA_INT.byteSize());
            Assert.assertEquals(segment12.scope(), frame1.scope());
            segment12.set(JAVA_INT, 0, 1);

            MemorySegment segment2;
            try (Arena frame2 = stack.pushFrame(JAVA_LONG.byteSize(), JAVA_LONG.byteAlignment())) {
                Assert.assertNotEquals(frame2.scope(), frame1.scope());
                // same here, but a new scope.
                segment2 = frame2.allocate(JAVA_LONG);
                Assert.assertEquals(segment2.address(), segment12.address() + /*segment12 size + frame 1 spare + alignment constraint*/ 3 * JAVA_INT.byteSize());
                Assert.assertEquals(segment2.byteSize(), JAVA_LONG.byteSize());
                Assert.assertEquals(segment2.scope(), frame2.scope());
                segment2.set(JAVA_LONG, 0, 1);

                // Frames must be closed in stack order.
                Assert.assertThrows(IllegalStateException.class, frame1::close);
            }
            // Scope is closed here, inner segments throw.
            Assert.assertThrows(IllegalStateException.class, () -> segment2.get(JAVA_INT, 0));
            // A new stack frame allocates at the same location (but different scope) as the previous did.
            try (Arena frame3 = stack.pushFrame(2 * JAVA_INT.byteSize(), JAVA_INT.byteAlignment())) {
                MemorySegment segment3 = frame3.allocate(JAVA_INT);
                Assert.assertEquals(segment3.scope(), frame3.scope());
                Assert.assertEquals(segment3.address(), segment12.address() + 2 * JAVA_INT.byteSize());
            }

            // Fallback arena behaves like regular stack frame.
            MemorySegment outOfStack;
            try (Arena hugeFrame = stack.pushFrame(1024, 4)) {
                outOfStack = hugeFrame.allocate(4);
                Assert.assertEquals(outOfStack.scope(), hugeFrame.scope());
                Assert.assertTrue(outOfStack.asOverlappingSlice(stackSegment).isEmpty());
            }
            Assert.assertThrows(IllegalStateException.class, () -> outOfStack.get(JAVA_INT, 0));

            // Outer segments are still accessible.
            segment11.get(JAVA_INT, 0);
            segment12.get(JAVA_INT, 0);
        }
    }

    @Test
    public void stress() throws InterruptedException {
        BufferStack stack = new BufferStack(256);
        Thread[] vThreads = IntStream.range(0, 1024).mapToObj(_ ->
                Thread.ofVirtual().start(() -> {
                    long threadId = Thread.currentThread().threadId();
                    while (!Thread.interrupted()) {
                        for (int i = 0; i < 1_000_000; i++) {
                            try (Arena arena = stack.pushFrame(JAVA_LONG.byteSize(), JAVA_LONG.byteAlignment())) {
                                // Try to assert no two vThreads get allocated the same stack space.
                                MemorySegment segment = arena.allocate(JAVA_LONG);
                                JAVA_LONG.varHandle().setVolatile(segment, 0L, threadId);
                                Assert.assertEquals(threadId, (long) JAVA_LONG.varHandle().getVolatile(segment, 0L));
                            }
                        }
                        Thread.yield(); // make sure the driver thread gets a chance.
                    }
                })).toArray(Thread[]::new);
        Thread.sleep(Duration.of(10, SECONDS));
        Arrays.stream(vThreads).forEach(
                thread -> {
                    Assert.assertTrue(thread.isAlive());
                    thread.interrupt();
                });
    }

    static {
        System.loadLibrary("TestBufferStack");
    }

    private static final MemoryLayout HVAPoint3D = structLayout(NativeTestHelper.C_DOUBLE, C_DOUBLE, C_DOUBLE);
    private static final MemorySegment UPCALL_MH = upcallStub(TestBufferStack.class, "recurse", FunctionDescriptor.of(HVAPoint3D, C_INT));
    private static final MethodHandle DOWNCALL_MH = downcallHandle("recurse", FunctionDescriptor.of(HVAPoint3D, C_INT, ADDRESS));

    public static MemorySegment recurse(int depth) {
        try {
            return (MemorySegment) DOWNCALL_MH.invokeExact((SegmentAllocator) Arena.ofAuto(), depth, UPCALL_MH);
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }

    @Test
    public void testDeepStack() throws Throwable {
        // Each downcall and upcall require 48 bytes of stack.
        // After five allocations we start falling back.
        MemorySegment point = recurse(10);
        Assert.assertEquals(point.getAtIndex(C_DOUBLE, 0), 12.0);
        Assert.assertEquals(point.getAtIndex(C_DOUBLE, 1), 11.0);
        Assert.assertEquals(point.getAtIndex(C_DOUBLE, 2), 10.0);
    }
}
Member

I think we should also have a test that exercises the new code in the context of the linker. We have some tests that test downcall -> upcall with by-value structs, but I think we should also have a test that keeps going until it exhausts the buffer. Something like test/jdk/java/foreign/stackwalk/TestReentrantUpcalls.java (but without the Whitebox stuff).

Contributor Author

mernst-github, Jan 23, 2025

Done. Verified manually during test that we satisfy the first 5 frames from the stack and fall back afterwards.
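
The "first 5 frames" figure is consistent with the sizes involved: `LINKER_STACK_SIZE` defaults to 256 bytes and the test comment states 48 bytes of stack per call. A small sketch of that arithmetic follows, assuming every frame is exactly 48 bytes with 8-byte alignment and that the backing segment starts at an aligned address. `framesThatFit` is a hypothetical helper, not part of the PR.

```java
/** Hypothetical helper that counts how many equal-sized frames fit on a bounded stack. */
final class LinkerStackArithmetic {
    static long framesThatFit(long stackSize, long frameSize, long alignment) {
        if (frameSize <= 0) {
            throw new IllegalArgumentException("frameSize must be positive");
        }
        long sp = 0;
        long frames = 0;
        while (true) {
            // Round the current offset up to the alignment (a power of two is assumed).
            long start = (sp + alignment - 1) & -alignment;
            if (start + frameSize > stackSize) {
                return frames; // the next frame would have to fall back
            }
            sp = start + frameSize;
            frames++;
        }
    }
}
```

Under these assumptions `framesThatFit(256, 48, 8)` yields 5: five frames occupy 240 bytes, and a sixth would need 288.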

39 changes: 39 additions & 0 deletions test/jdk/java/foreign/libTestBufferStack.c
@@ -0,0 +1,39 @@
/*
* Copyright (c) 2025, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*/

#include "export.h"

typedef struct { double x, y, z; } HVAPoint3D;

EXPORT HVAPoint3D recurse(int depth, HVAPoint3D (*cb)(int)) {
    if (depth == 0) {
        HVAPoint3D result = { 2, 1, 0};
        return result;
    }

    HVAPoint3D result = cb(depth - 1);
    result.x += 1;
    result.y += 1;
    result.z += 1;
    return result;
}