Skip to content

A go at making the depth_packet_stream_parser better. #75

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

larshg
Copy link
Contributor

@larshg larshg commented Oct 22, 2014

Iterates through all bytes to find footer but copy data in chunks.
Doesn't use additional working buffer.

I tried to find out why I was getting a lot of errors as shown here #72 (comment) .

And I found out that its probably not about the code, but rather a hardware/software issue of my dell Precision m6600, with a Renesas USB 3.0 controller not being able to process and acquire all the data that the Kinectv2 sends out.

Also when data is transferred using ISO-mode, the packages are not guaranteed to be received either in order or at all.

I don't have any measures if this is faster/better than the previous, but it avoids using the extra working buffer.

Update: I added a FPS counter(locally) which showed my laptop running about 20(min 16, max 24). Tried the code on my stationary pc and there are a lot fewer errors and showed about 29-30 fps, hence sometimes a frame is lost. I tried a official depth app also and it runs about 29-30 fps - maybe just a bit more stable at the 30 :)

@larshg larshg force-pushed the depthpacketparse branch 4 times, most recently from 492ba42 to 0c1f09c Compare October 28, 2014 13:05
@larshg larshg mentioned this pull request Oct 28, 2014

size_t max_length = std::min<size_t>(wb.capacity - wb.length, in_length - 8);
for (size_t i = 0; i < in_length; i++)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

upper bound should be in_length - sizeof(DepthSubPacketFooter) otherwise we might cause a segfault!?

@xlz
Copy link
Member

xlz commented Feb 11, 2015

For your reference one of the image data too short bug before this patch was sometime it detected the footer in the middle of a 33792-byte transfer which has actually no footer. I found this by adding a in_length != 33792 condition to the footer detection code and the error disappeared.

@xlz
Copy link
Member

xlz commented Feb 12, 2015

Here is my take. This is a complete rewrite, please ignore the diff and just look at the new function. It has been tested extensively .

Several considerations for the rewrite:

  • Scanning magic bytes for footer is dangerous. Many image data too short errors happened because it found the magic bytes in the middle where it is not actually the footer.
  • The footer seems to only appear at the very end of a transfer. This is also consistent with RGB transfers. So far I have not seen anything in experiments that contradicts this.
  • A packet and its subpackets all have fixed sizes, therefore the work buffer can be just a pointer to the actual buffer to save many memcpy's.
  • Correct packets and subpackets must have continuous sequence and subsequence numbers. If the buffer overruns, subpackets don't have 512*424*11/8 bytes, or the (sub)sequence numbers are not continuous, it must drop the entire packet and wait until the next packet. In all these cases there is no way to determine what has been lost within the current packet and the buffer will not be clean to continue assembling the current packet.

Some additional observations:

  • ISO transfer is very sensitive to USB autosuspend. If USB autosuspend is turned on, you can barely receive any data.
  • Booting with USB plugged in or hot-plugging might make a difference. In my case I have the Kinect plugged in before booting and disabled autosuspend, I received a lot of short subpackets (less than 512*424*11/8 bytes but footers detected).

@larshg
Copy link
Contributor Author

larshg commented Feb 16, 2015

Hi @xlz

I have tried out your implementation, but I get really bad performance :(
xlzpacketparser

I can't even manage to stop and find the average framerate printout...

But I like a lot of the ideas you bring up - so I'll try to figure out why its not working at my part.

I have added the implementation on a branch in my fork here

@xlz
Copy link
Member

xlz commented Feb 16, 2015

My version should be strictly faster than the current one because it does not use an extra buffer and does not memcpy unnecessarily. I have tested it for 70 fps without data loss.

It may be that something is different on Windows that needs change. So this depends on your investigation. You can check if you have correctly disabled USB autosuspend settings, and you can try running it without visualization (comment out cv::imshow). Thanks.

@larshg
Copy link
Contributor Author

larshg commented Feb 16, 2015

Hi @xlz

The above test was on my dell precision m6600 - and I just tried it on my stationary I7 GTX480 and here it runs a lot more smooth. (I see same behavior with the current(master/my go at it) implementation - simply more processing power :) )

But I'll try to investigate why its running slower than the current implementation on the laptop.

I think have seen various data transfers sometimes on the USB - this might be the problem (ie. looking at the end/memory scanning).

I have never done anything about USB autosuspend, but I'll try find out where to change this :)

It has nothing to do with the visualization.

@xlz
Copy link
Member

xlz commented Feb 16, 2015

@larshg It might also be the slow scrolling of cmd.exe output. How is it when you comment out processor_->process(packet); and it's only stream parsing without processing?

Iterates through all bytes to find footer.
Copy data in chunks.
Doesn't use additional working buffer.
Adds timestamp
@larshg larshg force-pushed the depthpacketparse branch from 67f14c4 to 7e757b5 Compare April 8, 2015 11:24
@xlz
Copy link
Member

xlz commented Apr 18, 2015

@larshg
I got an interesting performance characteristic. In my approach above I eliminated the work buffer and copied usb buffer directly into the main packet buffer, and immediately sent the packet buffer to the depth processor. After adding in VA-API JPEG decoder, this particular order of memcpys seems to be causing performance regression, that is:

Slow (0.5x performance)

  1. Collect transfers directly into buffer_
  2. Send buffer_ to depth processor upon subsequence == 9

Fast

  1. Collect transfers into work_buffer_
  2. Send buffer_ to depth processor upon subsequence == 0
  3. Copy work_buffer_ into buffer_

I don't know what is exactly the problem, but I reverted my patch to follow the original order of memcpys.

@xlz
Copy link
Member

xlz commented May 1, 2015

After closer look at libusb ISO transfer, it seems it is not even necessary to detect the footer at all. ISO packets length will provide clue about the end of a subpacket. And zero packet length indicates splitting between packets.

An example with one transfer a line:
(max usb iso packet size 33792, 8 usb packets per usb transfer, a depth subpacket data + footer = 512*424*11/8 + 152 bytes, 10 subpackets per depth packet)

 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 33792
 33792 33792 33792 33792 33792 33792 33792 28312
 0 0 0 33792 33792 33792 33792 33792
 33792 33792 33792 28312 0 0 33792 33792
 33792 33792 33792 33792 33792 33792 28312 0
 0 33792 33792 33792 33792 33792 33792 33792
 33792 28312 0 0 33792 33792 33792 33792
 33792 33792 33792 33792 28312 0 33792 33792
 33792 33792 33792 33792 33792 33792 28312 0
 0 0 33792 33792 33792 33792 33792 33792
 33792 33792 28312 0 33792 33792 33792 33792
 33792 33792 33792 33792 28312 0 0 0
 33792 33792 33792 33792 33792 33792 33792 33792
 28312 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0
 0 33792 33792 33792 33792 33792 33792 33792
 33792 28312 0 0 0 0 0 0
 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0

Also as can be seen, the 10th auxiliary subpacket lags in time and is not actually used for the depth image.

@xlz
Copy link
Member

xlz commented May 2, 2015

@larshg
I have updated the depth stream parser in my stream-parsers branch according to findings in the above comment.

I have removed enforcing of sequence number, and mostly minimized changes made to the original code to preserve the original execution order. The only changes now are 1) removing magic footer scanning, and assuming fixed sizes.

I have tested the patch under Linux and Windows. Please see if it works for you.

@larshg larshg closed this May 5, 2015
@larshg larshg deleted the depthpacketparse branch May 5, 2015 12:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants