@javeme javeme commented Dec 30, 2016

Traversing the Iterable values with a foreach loop a second time inside the reduce() method fails: the body of the second foreach loop is never executed.

JIRA MAPREDUCE-6827

The following code is a reduce() method (of WordCount):

public static class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

	@Override
	protected void reduce(Text key, Iterable<IntWritable> values, Context context)
			throws IOException, InterruptedException {

		// print some logs
		List<String> vals = new LinkedList<>();
		for(IntWritable i : values) {
			vals.add(i.toString());
		}
		System.out.println(String.format(">>>> reduce(%s, [%s])",
				key, String.join(", ", vals)));

		// sum of values
		int sum = 0;
		for(IntWritable i : values) {
			sum += i.get();
		}
		System.out.println(String.format(">>>> reduced(%s, %s)",
				key, sum));

		context.write(key, new IntWritable(sum));
	}
}

After running it, the value of the variable sum is always 0!

Debugging showed that the second foreach loop was never executed. The root cause is the return value of Iterable.iterator(): the two calls made by the two foreach loops return the same instance, which is already exhausted by the time the second loop starts. In general, Iterable.iterator() should return a new instance on each call, as ArrayList.iterator() does. This patch fixes the bug.
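The root cause described above can be reproduced without Hadoop. The following is a minimal sketch, not Hadoop's actual implementation: `SinglePassIterable` and `countPass` are hypothetical names introduced only for illustration, and the class simply hands out the same iterator instance on every call to `iterator()`.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SinglePassDemo {

    /**
     * Hypothetical stand-in for the reducer's values (an assumption for
     * illustration, not Hadoop code): iterator() always returns the same
     * instance instead of creating a new one.
     */
    static final class SinglePassIterable<T> implements Iterable<T> {
        private final Iterator<T> shared;

        SinglePassIterable(Iterator<T> source) {
            this.shared = source;
        }

        @Override
        public Iterator<T> iterator() {
            return shared; // same instance on every call
        }
    }

    /** Counts how many elements a single foreach pass visits. */
    static int countPass(Iterable<Integer> values) {
        int n = 0;
        for (Integer ignored : values) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 1, 1);
        Iterable<Integer> values = new SinglePassIterable<>(data.iterator());

        // The first pass consumes the shared iterator.
        System.out.println("first pass:  " + countPass(values));
        // The second pass gets the same, already exhausted iterator.
        System.out.println("second pass: " + countPass(values));
    }
}
```

With this stand-in, the first pass visits all three elements and the second pass visits none, which matches the symptom of sum staying at 0.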

Signed-off-by: Javeme [email protected]


javeme commented Dec 30, 2016

NOTE: The following test runs foreach twice over an int[] and over an ArrayList. The results are as expected (the second for-loop also executes correctly):

import java.util.ArrayList;

public class TestForeach {

	public static void main(String[] args) {
		
		// test foreach twice with int[]
		int list1[] = new int[]{1, 2};
		
		System.out.println("==== int[] 1");
		for(int i : list1) {
			System.out.println(i);
		}
		
		System.out.println("===int[] 2");
		for(int i : list1) {
			System.out.println(i);
		}
		
		// test foreach twice with ArrayList
		ArrayList<String> list = new ArrayList<String>();
		list.add("1");
		list.add("2");
		Iterable<String> list2 = list;

		System.out.println();
		System.out.println("===ArrayList 1");
		for(String i : list2) {
			System.out.println(i);
		}
		
		System.out.println("===ArrayList 2");
		for(String i : list2) {
			System.out.println(i);
		}
	}

}

@javeme javeme changed the title from "MAPREDUCE-6827. Failed to traverse Iterable values the second time in…" to "MAPREDUCE-6827. Fixed bug: the second foreach-loop was not executed" Jan 4, 2017

javeme commented Jan 4, 2017

According to Daniel Templeton, this behavior is expected.
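If the values can only be traversed once, the logging and the summing from the reducer in the description can share a single loop. The following is a plain-Java sketch of that idea, assuming `Integer` values in place of `IntWritable`; `OnePassSum`, `Result`, and `logAndSum` are hypothetical names, not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OnePassSum {

    /** Hypothetical holder for what the single pass produces. */
    static final class Result {
        final List<String> seen;
        final int sum;

        Result(List<String> seen, int sum) {
            this.seen = seen;
            this.sum = sum;
        }
    }

    /** Collects the printable form of each value and the sum in one traversal. */
    static Result logAndSum(Iterable<Integer> values) {
        List<String> seen = new ArrayList<>();
        int sum = 0;
        for (Integer v : values) {
            seen.add(v.toString()); // keep a copy of the value's text for the log
            sum += v;               // accumulate in the same pass
        }
        return new Result(seen, sum);
    }

    public static void main(String[] args) {
        String key = "word"; // placeholder key for the log lines
        Result r = logAndSum(Arrays.asList(1, 1, 1));
        System.out.println(String.format(">>>> reduce(%s, [%s])",
                key, String.join(", ", r.seen)));
        System.out.println(String.format(">>>> reduced(%s, %s)",
                key, r.sum));
    }
}
```

In a real reducer the same loop body would call `i.toString()` and `i.get()` on each `IntWritable`, as the original code does, so no second traversal of values is needed.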

@javeme javeme closed this Jan 4, 2017