Simplifying decodePixelMeasurement #405
Thanks, this is certainly helpful. The code was so convoluted because it was originally ported from the shader implementation. However, I'll hold this for 0.2 at the moment.
rebased to new master
I think correctness is paramount here because this is the source of other reimplementations. That is a lot of commits to review. I hope the refactored code is provably equivalent.
@@ -332,63 +326,26 @@ class CpuDepthPacketProcessorImpl: public WithPerfLogging

  int32_t decodePixelMeasurement(unsigned char* data, int sub, int x, int y)
  {
    if (x < 1 || y < 0 || 510 < x || 423 < y)
    {
      return lut11to16[0];
lut11to16[0] = 0
I did my own derivation and the refactoring seems correct. There are just too many commits, which kind of pollute the commit history.
Can you still squash them after creating a PR?
Yes
(It seems the line note with the Python computation went missing due to my push, I'll paste it below)
I see no reason to discard
Python computed what happens for x = 511:
>>> x = 511
>>> r1zi = (x >> 2) + ((x & 0x3) << 7)
>>> r1zi = r1zi * 11
>>> r1yi = r1zi >> 4
>>> r1yi
351
>>> r1zi = r1zi & 15
>>> r1zi
5
The
>>> i1 = 0 # Real data 0
>>> i2 = 0xffff # Garbage
>>> i1 = i1 >> r1zi
>>> i2 = i2 << (16 - r1zi)
>>> (i1 | i2) & 2047
0 # No garbage bits
It seems
The filter stages that follow copy the edge pixels. Allowing the
Given the result of the Python computation for
It should work for this function, not sure what happens at other places in the code. I'll have a look at cleaning up the commits, but not immediately.
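The session above can be generalized a little. The following is a minimal sketch, not code from the PR: the helper name `extract` and the constant `FIELD_MASK` are made up for illustration, and the only thing taken from the thread is the shift-and-mask formula itself. It checks every possible 16-bit garbage value in the second word at bit offset 5 (the x = 511 case), and contrasts that with offset 6, where the field does spill into the second word.

```python
FIELD_MASK = 2047  # 11-bit field, the same mask as in the session above


def extract(i1, i2, r1zi):
    """Combine two 16-bit words into one 11-bit value starting at bit offset r1zi."""
    low = i1 >> r1zi            # bits of the field that live in the first word
    high = i2 << (16 - r1zi)    # bits of the field that live in the second word
    return (low | high) & FIELD_MASK


# Offset 5: the field occupies bits 5..15 of the first word, so whatever the
# second word holds is shifted above bit 10 and removed by the mask.
leak_at_5 = {extract(0, garbage, 5) for garbage in range(0x10000)}

# Offset 6: the top bit of the field comes from the second word, so a garbage
# second word does change the result.
leak_at_6 = extract(0, 0xFFFF, 6)

print(leak_at_5, leak_at_6)
```

So the value computed for x = 511 does not depend on the second word, although the sketch says nothing about whether reading that word is a valid memory access.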
ptr[r1yi + 1] reads beyond the array end. This is an invalid memory access, not an arithmetic issue. Unless there is evidence that the edge pixels are inherently invalid, I believe it just makes the code have fewer branches. Filter1 does need a follow-up look at its edge cases.
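One way to keep the edge pixels without touching the word past the end of the row is to read the second word only when the 11-bit field actually straddles a word boundary. The sketch below is a hypothetical alternative, not what the PR does. It assumes a row of 512 columns packed as 11-bit fields into 352 16-bit words, with the bit layout inferred from the extraction formula in the Python session; `field_position`, `pack_row` and `extract_guarded` are illustrative names.

```python
WORDS_PER_ROW = 352  # assumption: 512 columns * 11 bits / 16 bits per word


def field_position(x):
    """Return (word index, bit offset) for column x, as in the session above."""
    slot = (x >> 2) + ((x & 0x3) << 7)
    bit = slot * 11
    return bit >> 4, bit & 15


def pack_row(values):
    """Pack 512 11-bit values into exactly WORDS_PER_ROW 16-bit words (test helper)."""
    bits = 0
    for x, value in enumerate(values):
        word, offset = field_position(x)
        bits |= (value & 2047) << (word * 16 + offset)
    return [(bits >> (16 * i)) & 0xFFFF for i in range(WORDS_PER_ROW)]


def extract_guarded(words, x):
    """Decode column x, reading the next word only if the field spills into it."""
    word, offset = field_position(x)
    value = words[word] >> offset
    if offset + 11 > 16:  # field crosses into the following word
        value |= words[word + 1] << (16 - offset)
    return value & 2047


values = [(x * 37 + 5) & 2047 for x in range(512)]
row = pack_row(values)  # no padding word after the row
decoded = [extract_guarded(row, x) for x in range(512)]
print(decoded == values)
```

The guard costs one extra branch per pixel, which is the trade-off being discussed here.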
I think the last commit is good. The Also now that it does not discard edge pixels, do those pixels contain sensible values (with filters disabled)?
Testing is the next step for me, but unfortunately that is currently complicated, as my USB is not stable. It will need a somewhat simpler testbed than the freenect2 application. I don't expect the memory access to make much difference; in general you'd need the next value soon for the next pixel anyway. Fixing the jumping around in memory access due to the
Due to the known range of the x coordinate, "r1zi >> 4" cannot go beyond 352. The only way to get there is with an out-of-bounds pixel (x, y) coordinate. Therefore, "return lut11to16[0]" happens only for a true boolean condition.
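The range claim can be checked by sweeping every column. This is a small sketch that reuses the index formula from the Python session earlier in the thread; the helper name `field_position` is illustrative and not taken from the codebase.

```python
def field_position(x):
    """Return (word index, bit offset) for column x, using the formula from the thread."""
    slot = (x >> 2) + ((x & 0x3) << 7)
    bit = slot * 11
    return bit >> 4, bit & 15


positions = {x: field_position(x) for x in range(512)}

# Largest word index reached by any in-range column.
max_word = max(word for word, _ in positions.values())

# Columns whose field starts in that last word, and which of them would
# need bits from the word after it.
last_word_columns = [x for x, (word, _) in positions.items() if word == max_word]
straddling = [x for x in last_word_columns if positions[x][1] + 11 > 16]

print(max_word, last_word_columns, straddling)
```

For x in 0..511 the largest word index is 351, only x = 511 starts in that word, and its field fits entirely inside it.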
Compressed the refactoring into 3 commits now. The tricky commit (767ba4a) has an explanation in its message. Ran tests without filter stages, comparing the current depth calculation against #404 plus this PR (without updating all pixels), and the output as well as the execution time is the same. If you add filtering, total execution time grows, so the relative change in execution time gets even smaller.
I mean if the edge pixels are physically valid values.
I'll give this some testing next Monday and merge it, #404 too.
I removed the last commit "Copy pixels at x==0 || x== 511 as well." and cherry-picked the two PRs into master.
There seems to be no change in performance (both are very slow, 6 Hz as I tested).
Not sure whether you want this, but I simplified the decodePixelMeasurement a lot.