Slow performance when using YOLO #3

@joansc

Hello!

I have started doing some early tests with this amazing and promising project. However, I'm not sure why I'm getting slow performance when trying to implement YOLO pose tracking. For reference, I'm using the same ExampleReceive.toe and the same TopChopDatIO.tox, with one slight change: the input is a video of people passing by, as you can see here:

image

Then, I'm using a yolov8n-pose.engine at a resolution of 640x320. Here's the code for the yolo.py script:

import keyboard # optional, used to quit the loop
import numpy as np
import torch
import touchpy as tp
from ultralytics import YOLO

#model = YOLO("yolov8n-pose.pt")  # load an official model
#model.export(format="engine", imgsz=(320,640))
#exit()

# tp.init_logging(level=tp.LogLevel.INFO, console=True, file=True)

torch.cuda.set_device(0)

class ExampleRunComp:
	def __init__(self):
		#self.running = True # used to gracefully exit the loop
		#self.device = torch.device('cuda')
		self.model = YOLO('yolov8n-pose.engine')

	@staticmethod
	def on_layout_change(comp, this):
		print('layout changed:')
		print('in tops:', comp.in_tops.names)
		print('out tops:', comp.out_tops.names)
		print('in chops:', comp.in_chops.names)
		print('out chops:', comp.out_chops.names)
		print('in dats:', comp.in_dats.names)
		print('out dats:', comp.out_dats.names)
		print('pars:', comp.par.names)
		# comp.out_tops[1].set_cuda_flags(tp.CudaFlags.BGRA | tp.CudaFlags.HWC)
		comp.out_tops['topOut2'].set_cuda_flags(tp.CudaFlags.BGR)

		# comp.par['Openwindow'].val = True
		return

	@staticmethod
	def on_frame(comp, this):

		if (keyboard.is_pressed('q') and keyboard.is_pressed('ctrl')):
			comp.stop() # stop running the comp
			return
		
		webcam_tensor = comp.out_tops['topOut2'].as_tensor() 

		comp.start_next_frame()
		results = this.model(webcam_tensor.unsqueeze(0),stream=True, device=0, max_det=5)
		result = next(results)

		if result is not None:
			annotatedArray = result.plot(boxes=False, labels=False)
			tensor = torch.from_numpy(annotatedArray).cuda()
			comp.in_tops['topIn1'].from_tensor(tensor, flags=tp.CudaFlags.BGR) 


	def runComp(self, tox_path):
		# create a comp object and specify a path to a tox file
		#comp = tp.Comp(tox_path)
		comp = tp.Comp(tox_path, flags=tp.CompFlags.INTERNAL_TIME_AUTO)

		comp.set_on_layout_change_callback(self.on_layout_change, self)
		comp.set_on_frame_callback(self.on_frame, self)

		comp.start() # start the comp, blocks with CompFlags.InternalTimeAuto and CompFlags.InternalTimeSemiAuto

		comp.unload() # should be called to properly unload the comp (especially if Python exits immediately after this)
		pass

# create an instance of a class that runs the comp
example = ExampleRunComp()

# run the comp
example.runComp('TopChopDatIO.tox')

When I run the script it seems everything is working fine:

image1

However, when I check syphonout1 in ExampleReceive, the stream looks slow, as you can see in the following video. In the demo from your presentation it runs pretty fast, which is why this doesn't make sense to me. Also, judging from the prints on the console, the processing itself seems fast.

TDMovieOut2.mp4
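
The console prints only suggest that processing is fast, so a measured callback rate would help narrow this down. Below is a minimal sketch of a rolling frame timer that could be called from on_frame. FrameTimer is a hypothetical helper written for illustration; it is not part of touchpy or ultralytics, and calling it from on_frame is an assumption about where the measurement is most useful.

```python
import time
from collections import deque


class FrameTimer:
    """Rolling average of the wall-clock interval between successive tick() calls."""

    def __init__(self, window=60, clock=time.perf_counter):
        # clock is injectable so the timer can be checked without real delays
        self._clock = clock
        self._last = None
        self._intervals = deque(maxlen=window)

    def tick(self):
        """Record one frame; call once per on_frame invocation."""
        now = self._clock()
        if self._last is not None:
            self._intervals.append(now - self._last)
        self._last = now

    def fps(self):
        """Return frames per second over the window, or 0.0 with fewer than two ticks."""
        if not self._intervals:
            return 0.0
        mean = sum(self._intervals) / len(self._intervals)
        return 1.0 / mean if mean > 0 else 0.0
```

With this in place, one could create a single `FrameTimer()` next to the model, call `tick()` at the top of on_frame, and print `fps()` every few seconds. If the number reported there is high while the stream in ExampleReceive still looks slow, that would point toward the output side rather than the inference itself.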

Any idea what it could be? I'm on a Windows 11 PC, using TD 2023.11600 and an RTX 4090.

Thanks in advance,

Joan
