
Conversation

@bedapisl (Contributor)

Profiling is switched on by the model.profile variable in the YAML config:

model:
    profile: True

The profile is saved in the log_dir of the model and can be viewed in Google Chrome at chrome://tracing/.
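For context, chrome://tracing consumes a plain JSON list of trace events. A minimal hand-rolled sketch of that format (the event names and timings here are made up for illustration; the PR itself generates the file via TensorFlow's timeline module):

```python
import json

# Chrome trace format: a dict with a "traceEvents" list; "ph": "X" marks a
# complete event with a start timestamp ("ts") and duration ("dur"), both in
# microseconds, plus process ("pid") and thread ("tid") ids.
events = [
    {"name": "conv1", "ph": "X", "ts": 0, "dur": 1500, "pid": 0, "tid": 0},
    {"name": "relu1", "ph": "X", "ts": 1500, "dur": 200, "pid": 0, "tid": 0},
]
trace_json = json.dumps({"traceEvents": events})
# Writing trace_json to <log_dir>/profile.json makes it loadable in chrome://tracing.
```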

@bedapisl bedapisl requested a review from FloopCZ January 28, 2019 16:52
@bedapisl bedapisl changed the title Add profiling of neural networks Profiling of neural networks Jan 28, 2019
@FloopCZ (Contributor) left a comment:

Very nice functionality, thank you @bedapisl for pointing it out!
Can you please add the guidelines on how to use it? It should be visible here: http://emloop.org/advanced/model.html

outputs = self._session.run(fetches=fetches, feed_dict=feed_dict,
options=run_options, run_metadata=run_metadata)

with open(path.join(self._log_dir, "profile.json"), "w") as ofile:

Suggested change
with open(path.join(self._log_dir, "profile.json"), "w") as ofile:
with open(path.join(self._log_dir, 'profile.json'), 'w') as ofile:

options=run_options, run_metadata=run_metadata)

if self._profile:
with open(path.join(self._log_dir, "profile.json"), "w") as ofile:

Suggested change
with open(path.join(self._log_dir, "profile.json"), "w") as ofile:
with open(path.join(self._log_dir, 'profile.json'), 'w') as ofile:

"""Test frozen model restoration."""
with pytest.raises(ValueError):
FrozenModel(inputs=[], outputs=[], restore_from=tmpdir) # there is no .pb file yet
FrozenModel(log_dir="/dev/null", inputs=[], outputs=[], restore_from=tmpdir) # there is no .pb file yet

Suggested change
FrozenModel(log_dir="/dev/null", inputs=[], outputs=[], restore_from=tmpdir) # there is no .pb file yet
FrozenModel(log_dir='/dev/null', inputs=[], outputs=[], restore_from=tmpdir) # there is no .pb file yet


# restore from directory
FrozenModel(**_IO, restore_from=tmpdir)
FrozenModel(log_dir="/dev/null", **_IO, restore_from=tmpdir)

Suggested change
FrozenModel(log_dir="/dev/null", **_IO, restore_from=tmpdir)
FrozenModel(log_dir='/dev/null', **_IO, restore_from=tmpdir)


# restore from file
FrozenModel(**_IO, restore_from=path.join(tmpdir, 'model.pb'))
FrozenModel(log_dir="/dev/null", **_IO, restore_from=path.join(tmpdir, 'model.pb'))

Suggested change
FrozenModel(log_dir="/dev/null", **_IO, restore_from=path.join(tmpdir, 'model.pb'))
FrozenModel(log_dir='/dev/null', **_IO, restore_from=path.join(tmpdir, 'model.pb'))


# restore from directory
frozen_model = FrozenModel(**_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})
frozen_model = FrozenModel(log_dir="/dev/null", **_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})

Suggested change
frozen_model = FrozenModel(log_dir="/dev/null", **_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})
frozen_model = FrozenModel(log_dir='/dev/null', **_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})

model.save('')

frozen_model = FrozenModel(inputs=['input'], outputs=['output'], restore_from=tmpdir)
frozen_model = FrozenModel(log_dir="/dev/null", inputs=['input'], outputs=['output'], restore_from=tmpdir)

Suggested change
frozen_model = FrozenModel(log_dir="/dev/null", inputs=['input'], outputs=['output'], restore_from=tmpdir)
frozen_model = FrozenModel(log_dir='/dev/null', inputs=['input'], outputs=['output'], restore_from=tmpdir)


# restore from directory
frozen_model = FrozenModel(**_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})
frozen_model = FrozenModel(log_dir="/dev/null", **_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})

Line too long.

if self._profile:
with open(path.join(self._log_dir, "profile.json"), "w") as ofile:
tl = timeline.Timeline(run_metadata.step_stats)
ofile.write(tl.generate_chrome_trace_format())

This seems to overwrite the profile file on every call to run with only the statistics from this single call. Shouldn't it rather create the run_metadata variable in the constructor and consider the statistics from all the calls to run?
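The reviewer's suggestion (statistics from all calls, not just the last one) could be sketched by accumulating the per-run trace events and dumping them together. This is an illustrative pure-Python sketch, not the PR's implementation; the class and method names are made up, and in the real code the events would come from timeline.Timeline:

```python
import json

class ProfileAccumulator:
    """Collects Chrome trace events across multiple run() calls (illustrative)."""

    def __init__(self):
        self._events = []

    def add_run(self, events, offset_us):
        # Shift each run's timestamps so consecutive runs do not overlap
        # when displayed in chrome://tracing.
        for event in events:
            shifted = dict(event)
            shifted["ts"] = event["ts"] + offset_us
            self._events.append(shifted)

    def dump(self):
        # Serialize everything collected so far into one trace file.
        return json.dumps({"traceEvents": self._events})

acc = ProfileAccumulator()
acc.add_run([{"name": "run0_op", "ph": "X", "ts": 0, "dur": 10, "pid": 0, "tid": 0}], 0)
acc.add_run([{"name": "run1_op", "ph": "X", "ts": 0, "dur": 12, "pid": 0, "tid": 0}], 1000)
merged = json.loads(acc.dump())
```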


with open(path.join(self._log_dir, "profile.json"), "w") as ofile:
tl = timeline.Timeline(run_metadata.step_stats)
ofile.write(tl.generate_chrome_trace_format())

Ditto statistics from a single call.


Well, this is a bit tricky. Both first (warm-up) and last (possibly smaller batch) profiles may be inaccurate. So what do we want to actually save? I guess keeping last keep_profiles: int = 5 is reasonable.
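The "keep the last keep_profiles profiles" idea could be sketched as a rolling window of per-run profiles. The class and file-naming scheme below are hypothetical; only the keep_profiles default of 5 comes from the reviewer's suggestion:

```python
from collections import deque

class RollingProfiles:
    """Keeps only the last `keep_profiles` run profiles (illustrative sketch)."""

    def __init__(self, keep_profiles: int = 5):
        # deque with maxlen automatically evicts the oldest entry.
        self._profiles = deque(maxlen=keep_profiles)

    def add(self, run_index: int, trace: dict) -> None:
        self._profiles.append((run_index, trace))

    def filenames(self):
        # One file per kept run, e.g. profile_3.json ... profile_7.json.
        return ["profile_%d.json" % i for i, _ in self._profiles]

profiles = RollingProfiles(keep_profiles=5)
for run in range(8):  # 8 runs; only the last 5 survive
    profiles.add(run, {"traceEvents": []})
```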

@blazekadam (Contributor) commented Jan 28, 2019

Whoa, @FloopCZ I thought we already gave up the " vs ' thing...

@blazekadam (Contributor) left a comment:

Thank you for the code @bedapisl, I am sure it will eventually be very useful in other cases. Let's just pay a bit more attention to it so that we can merge it into this rather stable (well-tested) project.

  • make log_dir optional in both models (and avoid modifying the tests)
  • implement reasonable behaviour with respect to multiple run calls (you have two suggestions from me and @FloopCZ )
  • restructure the code so that duplication is avoided
  • explain this feature to @gdynusa and ask her for tests once you do the above
  • mention this feature in the docs as @FloopCZ suggests

Anyways, thank you again!

def __init__(self,
inputs: List[str], outputs: List[str], restore_from: str,
session_config: Optional[dict]=None, n_gpus: int=0, **_):
log_dir: str, inputs: List[str], outputs: List[str], restore_from: str,

The log_dir argument should rather be optional, as it was previously, if I am not mistaken. Of course, we should sanitize the arguments similarly to this:

if profile and not log_dir:  # works for both None and empty string
    raise ValueError('log_dir has to be specified with profile set to True')

"""
Initialize new :py:class:`FrozenModel` instance.
:param log_dir: path to the logging directory (wherein models should be saved)

This docstring is inaccurate, as FrozenModel cannot be saved. Is it solely for the profile, or is it not?

:param restore_from: restore model path (either a dir or a .pb file)
:param session_config: TF session configuration dict
:param n_gpus: number of GPUs to use (either 0 or 1)
:param profile: whether profile.json should be saved to log_dir

Suggested change
:param profile: whether profile.json should be saved to log_dir
:param profile: if true, profile the speed of model inference and save profile.json to the specified log_dir

self._graph = tf.Graph()
if session_config:
session_config = tf.ConfigProto(**session_config)


Is this intentional?


with open(path.join(self._log_dir, "profile.json"), "w") as ofile:
tl = timeline.Timeline(run_metadata.step_stats)
ofile.write(tl.generate_chrome_trace_format())

Well, this is a bit tricky. Both first (warm-up) and last (possibly smaller batch) profiles may be inaccurate. So what do we want to actually save? I guess keeping last keep_profiles: int = 5 is reasonable.

"""Name of the monitored signal variance tensor/output."""

def __init__(self, # pylint: disable=too-many-arguments
dataset: Optional[el.AbstractDataset], log_dir: Optional[str], inputs: List[str], outputs: List[str],

I would make the log_dir indeed optional with default None.

@bedapisl (Author) commented Feb 8, 2019:

Is this in some way related to this pull request?

:param monitor: monitor signal mean and variance of the tensors whose names contain the specified value
:param restore_fallback: ignored arg. (allows training from configs saved by emloop where it is added)
:param clip_gradient: limit the absolute value of the gradient; set to None for no clipping
:param profile: whether profile.json should be saved to log_dir

ditto description

for output_name in self.output_names:
fetches.append(tower[output_name])

run_options = None

This is more or less the same code as in FrozenModel. Can you perhaps wrap session.run in some util function which would be used in both classes?

project = 'emloop-tensorflow'
copyright = '2018, Iterait a.s.'
author = 'Blazek Adam, Belohlavek Petr, Matzner Filip'
author = 'Blazek Adam, Belohlavek Petr, Matzner Filip, Bedrich Pisl'

nice :)


Nice except that Beda's name and surname are in the wrong order. :-)


whatever

@@ -0,0 +1,27 @@
Profiling networks

Thank you for the docs.

@blazekadam blazekadam merged commit 6b2e648 into dev Feb 9, 2019
@blazekadam blazekadam deleted the profiling branch February 9, 2019 13:40