How to make stages? #2758

@thepowerfuldeez

Description

Hi! Thank you for a great framework! I'm trying to set up training stages. E.g., in my config:

    <<: *default

  stage2:
    <<: *default

    datasets:
      <<: *datasets
             # 240k
      root: ['path1',
             'path2',
             'path3']
      per_folder_ratio: [1.0, 1.0, 1.0]
      transform:
        <<: *transform
        augs_lvl: "light"

    optimizer:
      class_name: RAdam
      lr: 0.0001

and in train.py:

    for stage_name, stage in selected_stages.items():
        print(f"running stage {stage_name}")
        model = build_model(stage)  # build the LightningModule for this stage

        trainer = build_trainer(stage_config=stage,
                                module=model,
                                ...)  # build a pytorch_lightning Trainer for this model
        trainer.fit(model)

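For context, `selected_stages` is just a dict of stage name → stage config taken from the YAML above; a rough sketch of how it could be built (the file name and the "stage" prefix filter here are illustrative, not my exact code):

    import yaml

    # safe_load resolves the <<: *default merge keys into plain dicts
    with open("config.yml") as f:
        config = yaml.safe_load(f)

    # keep only the stage entries, in the order they appear in the file
    selected_stages = {name: cfg for name, cfg in config.items()
                       if name.startswith("stage")}
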
I'm using early stopping so that each stage's fit finishes and the loop moves on to the next stage (the other approach I've tried is simply waiting out max_epochs epochs).
The problem is that the second call to trainer.fit initializes DDP once more, and the program crashes because the address/port from the previous DDP init has not been freed:

RuntimeError: Address already in use

I've tried the master version of PyTorch Lightning, but the problem did not disappear.
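
One workaround I'm considering (an untested sketch, the port numbers are arbitrary) is to give each stage its own MASTER_PORT, so every DDP init binds a fresh port instead of the one still held by the previous stage:

    import os

    for i, (stage_name, stage) in enumerate(selected_stages.items()):
        # point each DDP init at a different rendezvous port so the port
        # from the previous stage (not yet released) is never reused
        os.environ["MASTER_PORT"] = str(29500 + i)

        print(f"running stage {stage_name}")
        model = build_model(stage)
        trainer = build_trainer(stage_config=stage, module=model)  # plus the other args as above
        trainer.fit(model)

Alternatively, each stage could be launched in its own subprocess, so the process group from the previous stage is torn down completely before the next fit starts.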

Metadata


Labels: discussion, feature, question, working as intended
