Generic decoding is a challenging problem in visual neural decoding. Existing methods based on generative models ignore prior knowledge, which leads to poor interpretability, and few of them address the processing of fMRI (functional Magnetic Resonance Imaging) data. To tackle these problems, we propose a novel framework for generic decoding named GD-VAE. GD-VAE builds on the Variational Auto-Encoder (VAE), which yields a meaningful latent space, and consists of four modules: a feature extractor, a feature VAE, a Prior Knowledge Network (PKN), and a Latent Space Disentangling Network (LSDN). The feature extractors extract features from raw visual and cognitive data, and the feature VAE performs decoding through a latent space shared by both modalities. The PKN and LSDN impose a fine-grained structure on the VAE latent space so that the information carried by each subspace is made explicit. These modules enable alignment between the visual and cognitive modalities and provide greater interpretability. Experiments on the Generic Decoding Dataset validate the effectiveness and interpretability of the proposed method.
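The abstract names the four modules but not their internals. As a rough illustration only, the PyTorch sketch below shows one plausible layout of the shared-latent feature VAE and a cross-modal alignment term between the visual and fMRI-derived features. All layer sizes, the loss forms, and the `gd_vae_step` helper are assumptions for illustration, not the paper's implementation; the PKN and LSDN constraints are omitted because the abstract does not specify them, and both modalities' extracted features are assumed to share one dimensionality.

```python
# Hypothetical sketch of a GD-VAE-style shared-latent VAE; sizes and losses assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureVAE(nn.Module):
    """Shared-latent VAE: encodes visual or cognitive (fMRI) features
    into one latent space and decodes back to feature space."""
    def __init__(self, feat_dim: int = 512, latent_dim: int = 64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim)
        )

    def encode(self, x):
        h = self.enc(x)
        return self.mu(h), self.logvar(h)

    def reparameterize(self, mu, logvar):
        # Standard VAE reparameterization trick.
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.dec(z), mu, logvar

def gd_vae_step(vae, vis_feat, fmri_feat, beta=1.0, align_w=1.0):
    """One training step (assumed form): reconstruct both modalities in a
    shared latent space and align their posterior means across modalities."""
    rec_v, mu_v, lv_v = vae(vis_feat)
    rec_f, mu_f, lv_f = vae(fmri_feat)
    rec = F.mse_loss(rec_v, vis_feat) + F.mse_loss(rec_f, fmri_feat)
    kld = -0.5 * torch.mean(1 + lv_v - mu_v.pow(2) - lv_v.exp()) \
          - 0.5 * torch.mean(1 + lv_f - mu_f.pow(2) - lv_f.exp())
    align = F.mse_loss(mu_v, mu_f)  # cross-modal alignment (assumed form)
    return rec + beta * kld + align_w * align

# Usage with random stand-in features (batch of 8 paired samples):
vae = FeatureVAE()
loss = gd_vae_step(vae, torch.randn(8, 512), torch.randn(8, 512))
loss.backward()
```

In this sketch the alignment penalty on the posterior means is what ties the two modalities to one latent space; the paper's actual alignment mechanism, via the PKN and LSDN, is not described in the abstract.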