Decays the learning rate of each parameter group by a factor of gamma every epoch. When last_epoch=-1, the initial learning rate is the lr set in the optimizer.
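
The decay rule described above can be illustrated with a minimal pure-Python sketch. This is not the scheduler's actual implementation; the function name `exponential_decay_schedule` and its parameters are hypothetical, chosen only to show how the learning rate evolves when it is multiplied by gamma once per epoch, starting from the initial lr.

```python
def exponential_decay_schedule(initial_lr, gamma, num_epochs):
    """Return the learning rate in effect at each epoch under exponential decay.

    Hypothetical helper for illustration only; it mirrors the rule
    lr_epoch = initial_lr * gamma ** epoch.
    """
    lrs = []
    lr = initial_lr
    for _ in range(num_epochs):
        lrs.append(lr)      # learning rate used during this epoch
        lr = lr * gamma     # decay applied once at the end of the epoch
    return lrs


schedule = exponential_decay_schedule(initial_lr=0.1, gamma=0.5, num_epochs=4)
print(schedule)  # → [0.1, 0.05, 0.025, 0.0125]
```

With gamma=0.5 the learning rate halves every epoch; values of gamma closer to 1 give a slower decay.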