Skip to content

Conversation

@zhehangd
Copy link
Contributor

@zhehangd zhehangd commented Aug 2, 2022

This is exactly the fix proposed in #660 by @kidrigger in order to fix #652. The original PR was closed with the code removed somehow, but I believe it is a proper fix as discussed below. Another probably related PR #662 also removed the original code (only test cases left) after the merge of godotengine/godot#57968, which however didn't solve all the problems.

To summarize, when a function news and returns a new ref counted object like the code below, the object is freed immediately after the function returns so on the GDScript side we only see a null object return.

godot::Ref<godot::RefCounted> create_ref_counted_object(void) {
  godot::Ref<godot::RefCounted> obj_r;
  obj_r.instantiate();
  return obj_r;
}

The reason is that PtrToArg only passes the raw pointer to the engine so the moment the Ref gets destroyed it frees the object.
https://github.com/zhehangd/godot-cpp/blob/master/include/godot_cpp/classes/ref.hpp#L249

template <class T>
struct PtrToArg<Ref<T>> {
	_FORCE_INLINE_ static void encode(Ref<T> p_val, const void *p_ptr) {
		*(void **)p_ptr = p_val->_owner;
	}
};

The solution proposed by #660 well fixed this issue.

	_FORCE_INLINE_ static void encode(Ref<T> p_val, const void *p_ptr) {
		if (p_val != nullptr) {
			p_val->reference();
			*(void**)p_ptr = p_val->_owner;
		} else {
			*(void**)p_ptr = nullptr;
		}
	}

@BastiaanOlij mentioned a concern that this may overwrite an existing pointer.

This will only work if the reference we're assigning is currently unset. If it is set that means we're just overwriting the current pointer and that object will become dangling.

The discussion happened several months ago and I don't the circumstance at that time, but for what I see in the current codebase this is not an issue.
PtrToArg::encode is used to support MethodBind::bind_ptrcall. On the engine side this is called from https://github.com/godotengine/godot/blob/b7346e50258655316a4541d17fd92cc3b3a3f6ef/modules/gdscript/gdscript_vm.cpp#L1907

GET_INSTRUCTION_ARG(ret, argc + 1);
VariantInternal::initialize(ret, Variant::OBJECT);
Object **ret_opaque = VariantInternal::get_object(ret);
method->ptrcall(base_obj, argptrs, ret_opaque);
VariantInternal::update_object_id(ret);

VariantInternal::initialize guarantees that the object pointer is cleared.
In addition, I think it is always caller's duty to ensure that the dst pointer is ready to take value.

Here is a C++ snippet to test this fix

class CreateRefCounted : public godot::RefCounted {
  GDCLASS(CreateRefCounted, godot::RefCounted);

 protected:

  static void _bind_methods(void) {
    godot::ClassDB::bind_method(godot::D_METHOD("test_case_1_raw_ptr"), &CreateRefCounted::test_case_1_raw_ptr);
    godot::ClassDB::bind_method(godot::D_METHOD("test_case_2_no_workaround"), &CreateRefCounted::test_case_2_no_workaround);
    godot::ClassDB::bind_method(godot::D_METHOD("test_case_3_workaround"), &CreateRefCounted::test_case_3_workaround);
    godot::ClassDB::bind_method(godot::D_METHOD("test_case_4_variant"), &CreateRefCounted::test_case_4_variant);
  }

 public:
  
  godot::RefCounted* test_case_1_raw_ptr(void) {
    godot::RefCounted *obj = memnew(godot::RefCounted);
    return obj;
  }
  
  godot::Ref<godot::RefCounted> test_case_2_no_workaround(void) {
    godot::Ref<godot::RefCounted> obj_r;
    obj_r.instantiate();
    return obj_r;
  }
  
  godot::Ref<godot::RefCounted> test_case_3_workaround(void) {
    godot::Ref<godot::RefCounted> obj_r;
    obj_r.instantiate();
    obj_r->reference();
    return obj_r;
  }
  
  godot::Variant test_case_4_variant(void) {
    godot::Ref<godot::RefCounted> obj_r;
    obj_r.instantiate();
    return obj_r;
  }
};

There are four functions that create a new RefCounted in different ways.
The first function returns a raw pointer without using Ref at all.
The second function is the standard way to create an object, which does not work without the fix.
The third function uses a workaround by giving it an extra reference, but this causes new troubles if other C++ functions also use it.
The fourth function converts the Ref to a Variant as return, this always works as Variant preserves the ownership inside the engine.

GDScript

# Returns the ID of the given RefCounted.
# Wrapping this in a function so obj is unref and freed on return.
func check_and_unref_obj(obj) -> int:
  if obj != null:
    return obj.get_instance_id()
  else:
    return 0

func _ready():
  var summary := ""
  var create: CreateRefCounted = CreateRefCounted.new()
  var test_cases: Dictionary = {
    "test_case_1_raw_ptr": check_and_unref_obj(create.test_case_1_raw_ptr()),
    "test_case_2_no_workaround": check_and_unref_obj(create.test_case_2_no_workaround()),
    "test_case_3_workaround": check_and_unref_obj(create.test_case_3_workaround()),
    "test_case_4_variant": check_and_unref_obj(create.test_case_4_variant()),
  } # Don't use create.call(case_name), it does not use ptrcall
  
  for case_name in test_cases:
    var obj_id: int = test_cases[case_name]
    summary += "case:%-26s is_valid:%d is_freed:%d id:%016x\n" %\
      [case_name, int(obj_id != 0), int(!is_instance_id_valid(obj_id)), obj_id]
    
  print(summary)

Here is the result for Alpha13 without the fix
is_valid means the function returns a non-null object, is_freed means the object is freed after all references are lost.
Clearly, without the workaround the object cannot return properly.
Somehow with the workaround the object leaks, which I don't really understand (didn't investigate, but the hacked extra reference should have be given to the engine variant), but anyway that is not the point of this PR.

case:test_case_1_raw_ptr        is_valid:1 is_freed:1 id:-7ffffffa3afffe74
case:test_case_2_no_workaround  is_valid:0 is_freed:1 id:0000000000000000
case:test_case_3_workaround     is_valid:1 is_freed:0 id:-7ffffffa38fffe74
case:test_case_4_variant        is_valid:1 is_freed:1 id:-7ffffffa37fffe73

This is if we apply the fix. Everything works fine. Case 3 leaks the object as expected as we have extra reference.

case:test_case_1_raw_ptr        is_valid:1 is_freed:1 id:-7ffffffa3afffe74
case:test_case_2_no_workaround  is_valid:1 is_freed:1 id:-7ffffffa39fffe74
case:test_case_3_workaround     is_valid:1 is_freed:0 id:-7ffffffa38fffe74
case:test_case_4_variant        is_valid:1 is_freed:1 id:-7ffffffa37fffe73

So that is all about this fix.
Another thing I am thinking is about is the first test case when we directly return a raw pointer.
It works almost the same as returning a Ref, with one subtle difference that no one calls RefCounted::init_ref (not very sure).
As far as I know only the constructors and assignments of Variant and Ref call init_ref while ptrcall does not go through any of them.
This may expose some risks, but I don't really know about it.
So I suggest that we either fix this inconsistency or disallow returning RefCount by raw pointer. That would be another PR.

@akien-mga akien-mga added the bug This has been identified as a bug label Aug 2, 2022
@akien-mga akien-mga added this to the 4.0 milestone Aug 2, 2022
@derammo
Copy link

derammo commented Aug 15, 2022

This also fixes the problem that an uninitialized (empty) Ref<Object> was crashing on access to _owner. Note this is a normal case that has to be done for ScriptExtension::get_base_script for example. Thanks!

@bruvzg bruvzg added the topic:gdextension This relates to the new Godot 4 extension implementation label Aug 31, 2022
@rburing
Copy link
Member

rburing commented Nov 7, 2022

In discussion on RocketChat #gdextension it was mentioned this approach could cause a leak in the following test:

Ref<SomeRef> pass_and_return(Ref<SomeRef> p_ref) {
  return p_ref;
}

causing p_ref to have one "reference" too many. (Not tested.) What do you think?

@BastiaanOlij
Copy link
Collaborator

I was testing returning references yesterday and it seems to be working now (see #949).

The problem with the solution here is that the reference count is increased, with it never decreasing, resulting in leaks. It probably worked when this PR was raised because the bug on the Godot side that has been fixed wasn't increasing the reference count either, but was decreasing it, so in a way we were doing it's work here.

@BastiaanOlij
Copy link
Collaborator

Well I did some testing on this after fixing up the code a little, and it doesn't leak as I expected. Not 100% sure I understand why as there doesn't seem to be a matching deref but...

@BastiaanOlij
Copy link
Collaborator

@Faless
Copy link
Contributor

Faless commented Dec 15, 2022

Superseded by #958

@Faless Faless closed this Dec 15, 2022
@zhehangd zhehangd deleted the fix-ref-return-bug branch October 7, 2024 13:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

archived bug This has been identified as a bug topic:gdextension This relates to the new Godot 4 extension implementation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GDExtension Ref<T> object freeing error.

7 participants