DINO SELF-SUPERVISED LEARNING: dinossl #59


Closed

Conversation

@JaejinCho (Collaborator) commented Jun 15, 2021

This is a work in progress [WIP]. The remaining items, in order:

  • Fix the parts that currently use labels, so that labels are not used at all:

    • The sampler needs to change so that it does not rely on labels.
    • Data preparation (at the shell level): do NOT concatenate utterances from the same speaker. However, to exploit the local/global view embedding contrast in training, concatenating by label to obtain long utterances might work better (needs discussion). In that case, the embedding learning becomes supervised.
  • Hyper-parameter tuning

  • Code trimming:

    • I do not think this needs to be done in this PR unless it blocks merging.
    • BTW, I have a private list of trimming items, which I can post later if needed.
  • The *_dinossl.py files were copied and edited from the corresponding .py files (likewise the *dinossl.sh scripts from the .sh ones).
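To make the local/global view contrast mentioned above concrete, here is a minimal, hypothetical sketch (function and parameter names are illustrative, not taken from this PR's code) of label-free multi-crop sampling in the DINO style: each utterance contributes two long "global" crops and several short "local" crops, chosen purely by random frame offsets, so no speaker labels are consulted.

```python
import numpy as np


def sample_dino_crops(utterance, global_len=400, local_len=100,
                      n_local=4, rng=None):
    """Return (global_crops, local_crops) frame slices from one utterance.

    `utterance` is a (n_frames, feat_dim) feature matrix. Crop start
    offsets are uniform over the valid positions, so no label
    information enters the sampling.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_frames = utterance.shape[0]
    assert n_frames >= global_len, "utterance too short for a global crop"

    def crop(length):
        start = rng.integers(0, n_frames - length + 1)
        return utterance[start:start + length]

    global_crops = [crop(global_len) for _ in range(2)]  # two global views
    local_crops = [crop(local_len) for _ in range(n_local)]
    return global_crops, local_crops
```

With this kind of sampling, longer source utterances give the global crops more room, which is why concatenation (by speaker or otherwise) comes up in the discussion above.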

@jesus-villalba (Contributor):
It could be good to create a new recipe directory for this, separate from v1.1 (for example, v2), since you need to change the experimental setup.

@jesus-villalba (Contributor):

About concatenating utterances or not: we concatenate by video id, not by speaker id. The question is whether we are allowed to use the video id in our setup. I think we should follow the protocol from last year's VoxSRC challenge; could you find out whether they allow it? They are repeating the challenge this year, and we should participate.
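Grouping by video id rather than speaker id can be done directly from the utterance ids. A minimal sketch, assuming VoxCeleb-style `speaker/video/segment` utterance ids (the id format and function name are assumptions, not from this PR):

```python
from collections import defaultdict


def group_by_video(utt_ids):
    """Group speaker/video/segment utterance ids by their speaker/video prefix.

    Only the video-level prefix is used as the grouping key, so the
    concatenation step never consults speaker labels on their own.
    """
    groups = defaultdict(list)
    for utt in utt_ids:
        spk, video, _segment = utt.split("/")
        groups[f"{spk}/{video}"].append(utt)
    return dict(groups)
```

Each resulting group could then be concatenated into one long utterance, which sidesteps speaker labels while still producing material long enough for global crops.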
