This paper takes a step towards temporal reasoning about dynamically changing videos in a latent space.
Humans possess an ability to abstractly reason about objects and their interactions, an ability not shared by state-of-the-art deep learning models.
We present hash embeddings, an efficient method for representing words in a continuous vector form.
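To make the idea concrete, the following is a minimal sketch of a hash-style embedding layer: each token id is mapped by k cheap hash functions into a small shared pool of component vectors, which are mixed with learned importance weights. The pool size, weight-table size, and the simple multiplicative hashes are illustrative assumptions, not the method's exact configuration.

```python
import torch
import torch.nn as nn

class HashEmbedding(nn.Module):
    """Sketch of a hash embedding: a small pool of component vectors shared
    across the vocabulary, combined per token via k hash lookups and
    learned importance weights (sizes below are arbitrary examples)."""

    def __init__(self, num_buckets=10_000, num_weights=100_000, dim=64, k=2):
        super().__init__()
        self.num_buckets = num_buckets
        self.num_weights = num_weights
        self.pool = nn.Embedding(num_buckets, dim)      # shared component vectors
        self.importance = nn.Embedding(num_weights, k)  # per-token mixing weights
        # k random odd multipliers act as cheap, distinct hash functions
        self.register_buffer("mults", torch.randint(1, 2**31 - 1, (k,)) | 1)

    def forward(self, token_ids):
        # k bucket indices per token via multiplicative hashing
        buckets = (token_ids.unsqueeze(-1) * self.mults) % self.num_buckets
        comps = self.pool(buckets)                        # (..., k, dim)
        w = self.importance(token_ids % self.num_weights) # (..., k)
        return (w.unsqueeze(-1) * comps).sum(dim=-2)      # (..., dim)

tokens = torch.randint(0, 1_000_000, (2, 5))  # large ids, no vocab-sized table needed
print(HashEmbedding()(tokens).shape)          # torch.Size([2, 5, 64])
```

The memory saving comes from the pool and weight tables being much smaller than a full vocabulary-sized embedding matrix, while collisions are softened by combining several hashed components per token.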
Deep generative models parameterized by neural networks have achieved state-of-the-art performance in unsupervised and semi-supervised learning.
Deep generative models trained with large amounts of unlabelled data have proven to be powerful within the domain of unsupervised learning.
We present an autoencoder that leverages learned representations to better measure similarities in data space.
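As a hedged illustration of measuring similarity in a learned space rather than pixel space, the sketch below trains an autoencoder with a reconstruction loss computed on the features of a separate feature network. The feature network, layer sizes, and data here are stand-in assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AE(nn.Module):
    """Plain autoencoder; sizes are illustrative."""
    def __init__(self, dim=784, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, hidden))
        self.dec = nn.Sequential(nn.Linear(hidden, 256), nn.ReLU(), nn.Linear(256, dim))

    def forward(self, x):
        return self.dec(self.enc(x))

# Hypothetical feature network providing the learned similarity metric.
feature_net = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 64))

ae = AE()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)

x = torch.rand(32, 784)  # stand-in batch of flattened images
recon = ae(x)
# Learned-similarity loss: compare feature representations, not raw pixels.
loss = torch.mean((feature_net(recon) - feature_net(x)) ** 2)
opt.zero_grad()
loss.backward()
opt.step()
```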
Document information extraction tasks performed by humans produce data consisting of a PDF or document-image input and extracted string outputs.