Hi all, I'm trying to train a model using detectron2.
I built and installed PyTorch '1.3.0a0+e367f60' and torchvision '0.5.0a0+da89dad', both from source.
When I run the command from https://github.com/facebookresearch/detectron2/blob/master/GETTING_STARTED.md:
python tools/train_net.py --num-gpus 8 \
--config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_1x.yaml
the following error occurred:
Traceback (most recent call last):
File "tools/train_net.py", line 161, in <module>
args=(args,),
File "/home/users/gaosiqi/download/detectron2/detectron2/engine/launch.py", line 49, in launch
daemon=False,
File "/home/users/gaosiqi/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 171, in spawn
while not spawn_context.join():
File "/home/users/gaosiqi/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:
-- Process 5 terminated with the following error:
Traceback (most recent call last):
File "/home/users/gaosiqi/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/users/gaosiqi/download/detectron2/detectron2/engine/launch.py", line 84, in _distributed_worker
main_func(*args)
File "/home/users/gaosiqi/download/detectron2/tools/train_net.py", line 149, in main
return trainer.train()
File "/home/users/gaosiqi/download/detectron2/detectron2/engine/defaults.py", line 329, in train
super().train(self.start_iter, self.max_iter)
File "/home/users/gaosiqi/download/detectron2/detectron2/engine/train_loop.py", line 132, in train
self.run_step()
File "/home/users/gaosiqi/download/detectron2/detectron2/engine/train_loop.py", line 212, in run_step
loss_dict = self.model(data)
File "/home/users/gaosiqi/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 545, in __call__
result = self.forward(*input, **kwargs)
File "/home/users/gaosiqi/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 442, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/users/gaosiqi/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 545, in __call__
result = self.forward(*input, **kwargs)
File "/home/users/gaosiqi/download/detectron2/detectron2/modeling/meta_arch/rcnn.py", line 82, in forward
proposals, proposal_losses = self.proposal_generator(images, features, gt_instances)
File "/home/users/gaosiqi/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 545, in __call__
result = self.forward(*input, **kwargs)
File "/home/users/gaosiqi/download/detectron2/detectron2/modeling/proposal_generator/rpn.py", line 179, in forward
self.training,
File "/home/users/gaosiqi/download/detectron2/detectron2/modeling/proposal_generator/rpn_outputs.py", line 136, in find_top_rpn_proposals
keep = batched_nms(boxes.tensor, scores_per_img, lvl, nms_thresh)
File "/home/users/gaosiqi/download/detectron2/detectron2/layers/nms.py", line 17, in batched_nms
return box_ops.batched_nms(boxes, scores, idxs, iou_threshold)
File "/home/users/gaosiqi/.conda/envs/open-mmlab/lib/python3.7/site-packages/torchvision-0.5.0a0+da89dad-py3.7-linux-x86_64.egg/torchvision/ops/boxes.py", line 70, in batched_nms
keep = nms(boxes_for_nms, scores, iou_threshold)
File "/home/users/gaosiqi/.conda/envs/open-mmlab/lib/python3.7/site-packages/torchvision-0.5.0a0+da89dad-py3.7-linux-x86_64.egg/torchvision/ops/boxes.py", line 31, in nms
return torch.ops.torchvision.nms(boxes, scores, iou_threshold)
File "/home/users/gaosiqi/.conda/envs/open-mmlab/lib/python3.7/site-packages/torch/_ops.py", line 61, in __getattr__
op = torch._C._jit_get_operation(qualified_op_name)
RuntimeError: No such operator torchvision::nms
Environments:
CentOS 6.3
CUDA 10.0
cuDNN 7.0.3
PyTorch '1.3.0a0+e367f60'
torchvision '0.5.0a0+da89dad'
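
To narrow this down, here is a minimal diagnostic sketch that can be run in the same conda environment. It prints the installed versions and checks whether the compiled `torchvision::nms` operator can be resolved at all, outside of detectron2. The helper name `check_torchvision_nms` is hypothetical and is not part of PyTorch or torchvision. A possible cause, not confirmed here, is that torchvision's compiled extension was not built or not loaded during the source install.

```python
import importlib


def check_torchvision_nms():
    """Report installed versions and whether torchvision::nms resolves.

    Hypothetical helper for diagnosis only; not part of any library.
    """
    report = {"torch": None, "torchvision": None, "nms_registered": False}
    try:
        torch = importlib.import_module("torch")
        torchvision = importlib.import_module("torchvision")
    except Exception as exc:  # missing package or failure while loading it
        report["error"] = str(exc)
        return report

    report["torch"] = torch.__version__
    report["torchvision"] = torchvision.__version__
    try:
        # Looking up the attribute fails when the operator is not registered,
        # which is the same lookup that fails in the traceback above.
        torch.ops.torchvision.nms
        report["nms_registered"] = True
    except (RuntimeError, AttributeError) as exc:
        report["error"] = str(exc)
    return report


if __name__ == "__main__":
    print(check_torchvision_nms())
```

If `nms_registered` comes back `False`, the problem would be in the torchvision build itself rather than in detectron2.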