Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -741,7 +741,14 @@ public void refreshNodeAttributesToScheduler(NodeId nodeId) {
if (host == null || host.attributes == null) {
return;
}
newNodeToAttributesMap.put(hostName, host.attributes.keySet());
// Use read lock and create defensive copy since
// other threads might access host.attributes
readLock.lock();
try {
newNodeToAttributesMap.put(hostName, new HashSet<>(host.attributes.keySet()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Judging from the stack, the problem occurred when the log was printed, and the wrong line was modified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that.

  1. The LOG statement which prints newNodeToAttributesMap tries to iterate host.attribute
  2. host.attribute gets modified by some other thread - leading to concurrent modification exception.

There are two ways to solve this

  1. As you said to readLock before LOG statement so that host.attribute does not get modified during LOG statement
  2. Create a defensive copy of host.attribute (under read lock because the modification can happen at that time as well).

The rationale behind using option 2 to avoid logging inconsistency- Assume that we readLock before LOG statement. Once the LOG statement is executed, some other thread modifies the host.attribute this will lead to we logging something and processing something else.

Creating a defensive copy make sure that we don't change value. i.e what is LOGed gets processed as well.

Copy link

Copilot AI Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a new HashSet on every call could impact performance. Consider using Collections.unmodifiableSet() if the caller doesn't need to modify the returned set, or implement a more efficient copying mechanism for frequently called methods.

Suggested change
newNodeToAttributesMap.put(hostName, new HashSet<>(host.attributes.keySet()));
newNodeToAttributesMap.put(hostName, Collections.unmodifiableSet(new HashSet<>(host.attributes.keySet())));

Copilot uses AI. Check for mistakes.
} finally {
readLock.unlock();
}

// Notify RM
if (rmContext != null && rmContext.getDispatcher() != null) {
Expand Down