MSR-Net: Multi-Scale Relighting Network for One-to-One Relighting

Reviewer 1

The paper proposes an efficient method for one-to-one image relighting that is much faster than existing methods. The network cannot generate the occluded dark parts of the image because it lacks data for those regions. However, the method performs fairly well at translating the color temperature of the input image to that of the target image.

Is there a way to address the limitation pointed out above (and also in the conclusion), possibly by learning from examples? It is true that the input image alone may not contain sufficient information about the dark regions, but what about giving the model access to an entire corpus of images beforehand so that it can learn by example?

Reviewer 2

Key Contributions: A fast and lightweight model for one-to-one image relighting (at least 50x faster inference than the similar pre-existing models considered) with competitive performance. A secondary contribution may be a proof of concept for training relighting models with the proposed combination of MAE, SSIM, perceptual, and TV losses (sketched below).
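For concreteness, here is a minimal sketch of what such a loss combination could look like, assuming a PyTorch setup. The term weights `w`, the uniform SSIM window, and the VGG16 layer cut-off are illustrative assumptions on my part, not details taken from the paper.

```python
import torch
import torch.nn.functional as F
import torchvision

def mae_loss(pred, target):
    # Mean absolute error over all pixels.
    return torch.mean(torch.abs(pred - target))

def ssim_loss(pred, target, window=11, c1=0.01 ** 2, c2=0.03 ** 2):
    # Simplified SSIM with a uniform window (a Gaussian window is also
    # common); minimizing 1 - SSIM encourages structural similarity.
    pad = window // 2
    mu_p = F.avg_pool2d(pred, window, 1, pad)
    mu_t = F.avg_pool2d(target, window, 1, pad)
    var_p = F.avg_pool2d(pred * pred, window, 1, pad) - mu_p ** 2
    var_t = F.avg_pool2d(target * target, window, 1, pad) - mu_t ** 2
    cov = F.avg_pool2d(pred * target, window, 1, pad) - mu_p * mu_t
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2))
    return 1.0 - ssim.mean()

class PerceptualLoss(torch.nn.Module):
    # L1 distance between frozen VGG16 feature maps; the layer cut-off is
    # an assumption (ImageNet input normalization omitted for brevity).
    def __init__(self):
        super().__init__()
        vgg = torchvision.models.vgg16(weights="DEFAULT").features[:16].eval()
        for p in vgg.parameters():
            p.requires_grad_(False)
        self.vgg = vgg

    def forward(self, pred, target):
        return F.l1_loss(self.vgg(pred), self.vgg(target))

def tv_loss(pred):
    # Total variation: penalizes abrupt neighboring-pixel transitions.
    dh = torch.mean(torch.abs(pred[..., :, 1:] - pred[..., :, :-1]))
    dv = torch.mean(torch.abs(pred[..., 1:, :] - pred[..., :-1, :]))
    return dh + dv

def combined_loss(pred, target, perceptual, w=(1.0, 1.0, 0.1, 1e-4)):
    # Weighted sum of the four terms; the weights w are placeholders.
    return (w[0] * mae_loss(pred, target)
            + w[1] * ssim_loss(pred, target)
            + w[2] * perceptual(pred, target)
            + w[3] * tv_loss(pred))
```

In a training step this would be called as, e.g., `loss = combined_loss(pred, target, PerceptualLoss())` on NCHW tensors in [0, 1].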

Strengths: Nice presentation, generally well-written paper. Convincing evidence for considerable speed-up without detracting from performance.

Weaknesses: As addressed in #1, for the purposes of this workshop the main weakness is the lack of discussion of differentiability in this setup. In general, the paper would benefit from more discussion of the wide array of metrics and losses considered (SSIM, perceptual, TV, PSNR, LPIPS), so that the authors' choices and the results make more sense to the reader.

1) As mentioned in #2, I believe the paper would benefit from more discussion of the various metrics for relighting evaluation, including their benefits and drawbacks (see the sketch after these comments for the kind of contrast I mean).

2) Lines 24-27 make it seem like the setup considered is restricted to transferring ‘light direction from North and color temperature 6500K’ to ‘light direction from East and color temperature 4500K’. However, the dataset used varies the light source direction in 8 ways and also varies color temperatures. If you are indeed treating any transfer, then I would change the words "for this data" to "for example". Otherwise, I would address why you are limiting the scope to this particular sub-task.

3) Both in the abstract and in the conclusion, it is stated that the proposed model "performs moderately for light gradient generation with respect to target image." However, I cannot find further discussion or explanation of this point within the body of the paper. Maybe this could be explored more.
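To illustrate the kind of metric contrast asked for in #1, here is a minimal sketch, assuming PyTorch tensors in [0, 1] with NCHW shape; the reference to the `lpips` package is an assumption about tooling, not something the paper specifies.

```python
import torch

def psnr(pred, target, max_val=1.0):
    # PSNR is log-scaled MSE: cheap and widely reported, but it rewards
    # blurry averages and is blind to structural/perceptual differences.
    mse = torch.mean((pred - target) ** 2)
    return 10.0 * torch.log10(max_val ** 2 / mse)

# LPIPS, by contrast, compares deep-network features and tracks human
# judgments more closely, at the cost of a pretrained backbone. Assuming
# the `lpips` package is installed:
#   import lpips
#   lpips_fn = lpips.LPIPS(net="alex")      # expects inputs in [-1, 1]
#   dist = lpips_fn(pred * 2 - 1, target * 2 - 1)
```

A table walking through such trade-offs would make the authors' choice of losses and evaluation metrics easier to follow.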

Reviewer 3

This technique is strongly applicable to machine vision, not only in its highlighted use case of relighting visual scenes, but more particularly in its blending of several loss functions at varying scales in a multi-step process reminiscent of ensemble models. The innovation delivers a substantial speed-up while achieving results of quality comparable to other methods. The references discuss de-hazing techniques (Multi-Scale Models); a brief discussion of the paper's technique applied to a de-hazing use case would be welcome.