Imaginative Walks: Generative Random Walk Deviation Loss for Improved Unseen Learning Representation
Under Review

Abstract

We propose a novel loss for generative models, dubbed GRaWD (Generative Random Walk Deviation), to improve learning representations of unexplored visual spaces. High-quality representations of unseen classes (or styles) are critical for facilitating novel image generation and better generative understanding of unseen visual classes, i.e., zero-shot learning (ZSL). By generating representations of unseen classes from their semantic descriptions, e.g., attributes or text, generative ZSL attempts to differentiate unseen from seen categories. The proposed GRaWD loss is defined over a dynamic graph that includes the seen class/style centers and the generated samples in the current minibatch. Our loss initiates a random walk from each center through the visual generations produced from hallucinated unseen classes. As a deviation signal, we encourage the random walk to land, after $t$ steps, at a feature representation that is difficult to classify as any of the seen classes. We demonstrate that the proposed loss improves unseen-class representation quality inductively on text-based ZSL benchmarks (CUB and NABirds) and on attribute-based ZSL benchmarks (AWA2, SUN, and aPY). In addition, we investigate the ability of the proposed loss to generate meaningful novel visual art on the WikiArt dataset. Experiments and human evaluations demonstrate that the GRaWD loss improves StyleGAN1 and StyleGAN2 generation quality and creates novel art that is significantly more preferable. Our code is publicly available at https://github.com/Vision-CAIR/GRaWD.

Video

Motivation

The Generative Random Walk Deviation loss encourages generatively visiting the realistic space (orange) while deviating from seen classes and avoiding the less realistic space (red). The loss starts from each seen class center (in green) and performs a random walk through generated examples of hallucinated unseen classes (in orange) for T steps. The landing representation is then encouraged to be far from, and hence distinguishable from, the seen classes. With this property, our loss helps improve generalized zero-shot learning performance.
Our generated art images (top, with orange borders) are also produced using our loss, with known art movements such as Cubism and High Renaissance treated as seen classes. The bottom part of the figure shows the nearest neighbors (NN) in the training set (green borders), which are visibly different.

Method: GRaWD

The Generative Random Walk Deviation loss starts from each seen class center (i.e., c_i). It then performs a random walk through generated examples of hallucinated unseen classes, produced by G(s_u; z), for T steps. The landing probability distribution of the random walk is encouraged to be uniform over the seen classes. To deviate carefully from the seen classes, the generated images are also encouraged to be classified as real by the discriminator D.
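The walk described above can be sketched in code. The following is a minimal NumPy illustration under our own assumptions, not the authors' implementation: the function name `grawd_loss`, the use of dot-product similarities for transition probabilities, and the `1e-8` log stabilizer are all our choices, and the discriminator realness term mentioned in the caption is omitted here.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Row-wise softmax with max subtraction for numerical stability."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def grawd_loss(centers, generated, t=3):
    """Sketch of a random-walk deviation loss.

    centers:   (K, d) seen-class centers c_i.
    generated: (M, d) features of generated samples from hallucinated
               unseen classes, i.e., G(s_u; z).
    t:         number of random-walk steps among generated samples.
    Returns a scalar: the cross-entropy between the landing distribution
    over seen classes and the uniform distribution.
    """
    # Transition probabilities: centers -> generated samples.
    p_cg = softmax(centers @ generated.T)      # (K, M)
    # Transition probabilities: generated -> generated (the walk itself).
    p_gg = softmax(generated @ generated.T)    # (M, M)
    # Transition probabilities: generated -> centers (landing step).
    p_gc = softmax(generated @ centers.T)      # (M, K)

    # Walk for t steps among generated samples, then land on the centers.
    walk = p_cg
    for _ in range(t):
        walk = walk @ p_gg
    landing = walk @ p_gc                      # (K, K) landing distribution

    # Encourage the landing distribution to be uniform over seen classes,
    # i.e., hard to classify as any particular seen class.
    K = centers.shape[0]
    uniform = np.full(K, 1.0 / K)
    return float(-(uniform * np.log(landing + 1e-8)).sum(axis=1).mean())
```

Minimizing this cross-entropy against a uniform target pushes the landing representation away from every individual seen class, which is the deviation signal the figure describes.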

Quantitative Results in ZSL

Zero-shot recognition from textual descriptions on the CUB and NABirds datasets (easy and hard splits), showing that adding the GRaWD loss improves performance. "tr" denotes the transductive setting.
Zero-shot recognition on class-level attributes of the AwA2, aPY, and SUN datasets, showing that the GRaWD loss also improves performance on attribute-based datasets. "tr" denotes the transductive setting.

Results in Art generation

Human experiments on art generated with the vanilla GAN, GRaWD, and CAN losses. Models trained with our loss have the highest mean likeability in all groups, and more participants believed the artwork generated by the model trained with our loss to be real.
The most liked art generated with StyleGAN trained with GRaWD.
An empirical approximation of the Wundt curve. It shows that novelty is likeable when the deviation from current art is limited; when this deviation is large, people tend to dislike the result. The color of each data point represents a specific model, and its label names the group according to the nomenclature in the paper. In this figure, art from the NN↑ group has lower likeability than art from the NN↓ group. Examples of a high- and a low-likeability artwork are shown.

Citation

If you find our work useful in your research, please consider citing:
@article{elhoseiny2021imaginative,
  title={Imaginative Walks: Generative Random Walk Deviation Loss for Improved Unseen Learning Representation},
  author={Elhoseiny, Mohamed and Jha, Divyansh and Yi, Kai and Skorokhodov, Ivan},
  journal={arXiv preprint arXiv:2104.09757},
  year={2021}
}