

class ignite.handlers.ModelCheckpoint(dirname, filename_prefix, save_interval=None, score_function=None, score_name=None, n_saved=1, atomic=True, require_empty=True, create_dir=True, save_as_state_dict=False)

ModelCheckpoint handler can be used to periodically save objects to disk.

This handler accepts two arguments:

  • an ignite.engine.Engine object

  • a dict mapping names (str) to objects that should be saved to disk.

See Notes and Examples for further details.

  • dirname (str) – Directory path where objects will be saved

  • filename_prefix (str) – Prefix for the filenames to which objects will be saved. See Notes for more details.

  • save_interval (int, optional) – if not None, objects will be saved to disk every save_interval calls to the handler. Exactly one of (save_interval, score_function) arguments must be provided.

  • score_function (Callable, optional) – if not None, it should be a function taking a single argument, an ignite.engine.Engine object, and returning a score (float). Objects with the highest scores will be retained. Exactly one of (save_interval, score_function) arguments must be provided.

  • score_name (str, optional) – if score_function is not None, its absolute value can be stored in the filename using score_name. See Notes for more details.

  • n_saved (int, optional) – Number of objects that should be kept on disk. Older files will be removed.

  • atomic (bool, optional) – If True, objects are serialized to a temporary file and then moved to the final destination, so that files are guaranteed not to be damaged (for example, if an exception occurs during saving).

  • require_empty (bool, optional) – If True, an exception is raised if there are any files starting with filename_prefix in the directory ‘dirname’.

  • create_dir (bool, optional) – If True, the directory ‘dirname’ is created if it does not exist.

  • save_as_state_dict (bool, optional) – If True, will save only the state_dict of the objects specified, otherwise the whole object will be saved.
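The write-then-move pattern described for the atomic option can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not ignite's implementation: the helper name atomic_write_bytes is hypothetical, and the temporary file is assumed to live in the same directory as the destination so the final rename stays on one filesystem.

```python
import os
import tempfile


def atomic_write_bytes(path, payload):
    """Write payload to path via a temporary file (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the destination so the final
    # rename does not cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(payload)
        # os.replace swaps the finished file into place in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # A failed write leaves the destination untouched; drop the partial file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If serialization fails halfway, only the temporary file is affected, which is the property the atomic flag is described as providing.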


This handler expects two arguments: an Engine object and a dict mapping names to objects that should be saved.

These names are used to specify filenames for saved objects. Each filename has the following structure: {filename_prefix}_{name}_{step_number}.pth. Here, filename_prefix is the argument passed to the constructor, name is the key in the aforementioned dict, and step_number is incremented by 1 with every call to the handler.

If score_function is provided, the user can store its absolute value in the filename using score_name. Each filename then has the following structure: {filename_prefix}_{name}_{step_number}_{score_name}={abs(score_function_result)}.pth. For example, with score_name="val_loss" and a score_function that returns -loss (as objects with the highest scores are retained), saved model filenames will look like model_resnet_10_val_loss=0.1234.pth.
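The two filename structures described in these notes can be reproduced with a small helper. This is only a sketch: the function checkpoint_filename is hypothetical (it is not part of ignite), and the way the score is formatted as text is an assumption chosen to match the example filename given in the notes.

```python
def checkpoint_filename(filename_prefix, name, step_number,
                        score_name=None, score=None):
    """Build a filename following the structure described in the notes."""
    stem = f"{filename_prefix}_{name}_{step_number}"
    if score_name is not None and score is not None:
        # The absolute value is stored, so a negated loss reads as the loss.
        stem += f"_{score_name}={abs(score)}"
    return stem + ".pth"


# Interval-based saving: only prefix, name and step number appear.
print(checkpoint_filename("myprefix", "mymodel", 4))
# Score-based saving with score_name="val_loss" and a score of -loss.
print(checkpoint_filename("model", "resnet", 10, "val_loss", -0.1234))
```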


>>> import os
>>> from ignite.engine import Engine, Events
>>> from ignite.handlers import ModelCheckpoint
>>> from torch import nn
>>> trainer = Engine(lambda batch: None)
>>> handler = ModelCheckpoint('/tmp/models', 'myprefix', save_interval=2, n_saved=2, create_dir=True)
>>> model = nn.Linear(3, 3)
>>> trainer.add_event_handler(Events.EPOCH_COMPLETED, handler, {'mymodel': model})
>>> trainer.run([0], max_epochs=6)
>>> os.listdir('/tmp/models')
['myprefix_mymodel_4.pth', 'myprefix_mymodel_6.pth']
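Why only steps 4 and 6 remain on disk can be checked with a small simulation of the save_interval and n_saved rules. This is a hedged sketch of the behaviour described in the parameter list, not ignite's code; simulate_saved_steps is a hypothetical helper that tracks step numbers instead of writing files.

```python
from collections import deque


def simulate_saved_steps(num_calls, save_interval, n_saved):
    """Return the step numbers still on disk after num_calls handler calls."""
    kept = deque()
    for step_number in range(1, num_calls + 1):
        if step_number % save_interval != 0:
            continue  # not a saving step
        kept.append(step_number)
        if len(kept) > n_saved:
            kept.popleft()  # the oldest file is removed
    return list(kept)


# Six epochs, saving every second call, keeping the two newest files.
print(simulate_saved_steps(num_calls=6, save_interval=2, n_saved=2))  # [4, 6]
```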

class ignite.handlers.EarlyStopping(patience, score_function, trainer)

EarlyStopping handler can be used to stop the training if no improvement is observed after a given number of events.

  • patience (int) – Number of events to wait without improvement before stopping the training.

  • score_function (Callable) – A function taking a single argument, an ignite.engine.Engine object, and returning a score (float). A higher score is considered an improvement.

  • trainer (Engine) – Trainer engine whose run is stopped if there is no improvement.


from ignite.engine import Engine, Events
from ignite.handlers import EarlyStopping

def score_function(engine):
    # Higher scores are better, so return the negated validation loss.
    val_loss = engine.state.metrics['nll']
    return -val_loss

# `trainer` and `evaluator` are assumed to be Engine instances created elsewhere.
handler = EarlyStopping(patience=10, score_function=score_function, trainer=trainer)
# Note: the handler is attached to an *Evaluator* (runs one epoch on validation dataset)
evaluator.add_event_handler(Events.COMPLETED, handler)
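The patience rule can be illustrated without an engine. The class below is a hypothetical stand-in written for this explanation, not ignite's EarlyStopping; it assumes that only a strictly higher score counts as an improvement and that the stop signal is raised once patience consecutive events pass without one.

```python
class PatienceCounter:
    """Hypothetical stand-in for the patience logic; not ignite's class."""

    def __init__(self, patience):
        self.patience = patience
        self.best_score = None
        self.counter = 0

    def update(self, score):
        """Record a score; return True when training should stop."""
        if self.best_score is None or score > self.best_score:
            # Only a strictly higher score counts as an improvement.
            self.best_score = score
            self.counter = 0
            return False
        self.counter += 1
        return self.counter >= self.patience


stopper = PatienceCounter(patience=2)
val_losses = [0.9, 0.7, 0.8, 0.75]
# Scores are negated losses, mirroring the score_function in the example.
decisions = [stopper.update(-loss) for loss in val_losses]
print(decisions)  # [False, False, False, True]
```

The loss of 0.75 is better than 0.8 but still worse than the best value of 0.7, so it counts as a second event without improvement and triggers the stop.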

class ignite.handlers.Timer(average=False)

Timer object can be used to measure (average) time between events.


average (bool, optional) – if True, calling the .value() method returns the total time measured divided by the value of the internal counter.


total time elapsed when the Timer was running (in seconds)




internal counter, useful for measuring average time, e.g. of processing a single batch. Incremented with the .step() method.




flag indicating if timer is measuring time.




When using Timer(average=True), do not forget to call timer.step() every time an event occurs. See the examples below.


Measuring total time of the epoch:

>>> from ignite.handlers import Timer
>>> import time
>>> work = lambda : time.sleep(0.1)
>>> idle = lambda : time.sleep(0.1)
>>> t = Timer(average=False)
>>> for _ in range(10):
...    work()
...    idle()
>>> t.value()

Measuring average time of the epoch:

>>> t = Timer(average=True)
>>> for _ in range(10):
...    work()
...    idle()
...    t.step()
>>> t.value()

Measuring average time it takes to execute a single work() call:

>>> t = Timer(average=True)
>>> for _ in range(10):
...    t.resume()
...    work()
...    t.pause()
...    idle()
...    t.step()
>>> t.value()

Using the Timer to measure average time it takes to process a single batch of examples:

>>> from ignite.engine import Engine, Events
>>> from ignite.handlers import Timer
>>> trainer = Engine(training_update_function)
>>> timer = Timer(average=True)
>>> timer.attach(trainer,
...              start=Events.EPOCH_STARTED,
...              resume=Events.ITERATION_STARTED,
...              pause=Events.ITERATION_COMPLETED,
...              step=Events.ITERATION_COMPLETED)
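The interplay of resume, pause, step and value in the examples above can be made concrete with a small re-creation. This is a sketch, not ignite's Timer: SketchTimer, FakeClock and the attribute names total and steps are hypothetical, and the clock is injectable only so that the demonstration is deterministic instead of relying on time.sleep.

```python
import time


class SketchTimer:
    """Hypothetical re-creation of the timer behaviour described above."""

    def __init__(self, average=False, clock=time.perf_counter):
        self._clock = clock
        self._average = average
        self._t0 = self._clock()
        self.total = 0.0     # time accumulated while running
        self.steps = 0       # incremented by step()
        self.running = True  # the timer starts running on creation

    def pause(self):
        if self.running:
            self.total += self._clock() - self._t0
            self.running = False

    def resume(self):
        if not self.running:
            self._t0 = self._clock()
            self.running = True

    def step(self):
        self.steps += 1

    def value(self):
        total = self.total
        if self.running:
            total += self._clock() - self._t0
        # With average=True, divide by the counter (at least 1).
        return total / max(self.steps, 1) if self._average else total


class FakeClock:
    """Manually advanced clock used in place of real sleeping."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
t = SketchTimer(average=True, clock=clock)
for _ in range(10):
    t.resume()
    clock.now += 0.1  # stands in for work()
    t.pause()
    clock.now += 0.1  # stands in for idle(); not counted while paused
    t.step()
print(round(t.value(), 3))  # 0.1
```

Only the time between resume and pause is accumulated, so the idle half of each iteration does not inflate the per-step average.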

attach(engine, start=Events.STARTED, pause=Events.COMPLETED, resume=None, step=None)

Register callbacks to control the timer.


Returns: self (Timer)


class ignite.handlers.TerminateOnNan(output_transform=<function TerminateOnNan.<lambda>>)

TerminateOnNan handler can be used to stop the training if the process_function’s output contains a NaN or infinite number or torch.tensor. The output can be a number, a tensor, or a collection of them. The training is stopped if at least one number or tensor has a NaN or infinite value. For example, if the output is [1.23, torch.tensor(…), torch.tensor(float('nan'))], the handler will stop the training.


output_transform (Callable, optional) – a callable that is used to transform the ignite.engine.Engine’s process_function’s output into a number or torch.tensor or collection of them. This can be useful if, for example, you have a multi-output model and you want to check one or multiple values of the output.


trainer.add_event_handler(Events.ITERATION_COMPLETED, TerminateOnNan())
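The kind of check described above, applied to plain Python numbers and nested collections, might look like the sketch below. It is an illustration only: contains_non_finite is a hypothetical helper, not ignite's implementation, and it deliberately ignores tensors so that it runs without torch.

```python
import math
from numbers import Real


def contains_non_finite(output):
    """Return True if any number nested inside output is NaN or infinite."""
    if isinstance(output, Real):
        return not math.isfinite(output)
    if isinstance(output, dict):
        return any(contains_non_finite(v) for v in output.values())
    if isinstance(output, (list, tuple)):
        return any(contains_non_finite(item) for item in output)
    # Other types (strings, None, tensors, ...) are ignored in this sketch.
    return False


print(contains_non_finite([1.23, {"loss": float("nan")}]))  # True
print(contains_non_finite((0.5, [1, 2.0])))                 # False
```

A single offending value anywhere in the structure is enough to return True, matching the rule that one NaN or infinite element stops the training.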