videoMultiGAN

public
64 stars
23 forks
1 issues

Commits

List of commits on branch master.

  • 60a91f6e5246e1d98763a4297c39b2140adee9bd: README update (pprannayk, 7 years ago)
  • 685454c000daa187abe23f3c09cf9d49edee4b9b: Pushing gif (pprannayk, 7 years ago)
  • 31d7b3462d1404c7a7dc505c40c06e1373864be1: Push gifs (pprannayk, 7 years ago)
  • 0aecfec9236ddffe5d50fe8cfa0dffab8c3ab72f: Rolling (pprannayk, 7 years ago)
  • 704cd5434c1f592fd3f999d6c82c4fc50739f7c3: Adding relevant images (pprannayk, 7 years ago)
  • 094c673190738982ba69b524605a15d8374b8096: adding (pprannayk, 7 years ago)

README

The README file for this repository.

Video Multi GAN

Video generation from text using tree-like decisions with GANs. The text annotation or statement is encoded by a language model (LM) into an embedding, which is then combined with a random vector to generate relevant videos and images.
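
The conditioning step described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes that "combined" means simple concatenation, and the function name `build_generator_input`, the 300-dimensional embedding and the 100-dimensional noise vector are all hypothetical choices made for the example.

```python
import numpy as np

def build_generator_input(text_embedding, noise_dim=100, rng=None):
    """Concatenate a sentence embedding with a Gaussian noise vector.

    Assumption: the generator input is [embedding ; noise]. The README only
    says the two are "combined", so other schemes are equally possible.
    """
    rng = np.random.default_rng() if rng is None else rng
    text_embedding = np.asarray(text_embedding, dtype=np.float32)
    noise = rng.standard_normal(noise_dim).astype(np.float32)
    return np.concatenate([text_embedding, noise])

# Stand-in for an LM sentence embedding (hypothetical size of 300).
embedding = np.zeros(300, dtype=np.float32)
z = build_generator_input(embedding, noise_dim=100, rng=np.random.default_rng(0))
print(z.shape)
```

Sampling a fresh noise vector for the same embedding is what would let one sentence map to several different videos.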

Video Generation models

  1. VAEGAN
  2. VAEGAN with latent variable optimization
  3. VAEGAN with anti-reconstruction loss
  4. VAEGAN with anti-reconstruction loss and latent variable optimization
  5. Variants of the above models with different hyperparameters

Model structure

  • LSTM-based model for next-frame generation
  • Discriminator trained in a Wasserstein GAN setting
  • Word-embedding-based LM
  • Attention-based model for the classification structure
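
For reference, the Wasserstein GAN setting mentioned in the list above is usually written with the losses below. This is a generic sketch of the standard WGAN objective with weight clipping, not code taken from this repository; the function names and the clipping constant 0.01 are assumptions for illustration, and the repository may use a different variant.

```python
import numpy as np

def critic_loss(real_scores, fake_scores):
    """WGAN critic loss: minimise E[D(fake)] - E[D(real)]."""
    return float(np.mean(fake_scores) - np.mean(real_scores))

def generator_loss(fake_scores):
    """WGAN generator loss: minimise -E[D(fake)]."""
    return float(-np.mean(fake_scores))

def clip_weights(weights, c=0.01):
    """Clip every critic weight array to [-c, c] (original WGAN heuristic)."""
    return [np.clip(w, -c, c) for w in weights]

# Toy critic scores: real samples score higher than generated ones.
real = np.array([1.0, 1.0])
fake = np.array([0.0, 0.0])
print(critic_loss(real, fake), generator_loss(fake))
```

Unlike a standard GAN discriminator, the critic outputs unbounded scores rather than probabilities, so no sigmoid or log appears in either loss.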

Training model

  • The relevant models are written in TensorFlow >= v1.2
  • Experiments cover the models listed above
  • Training is done on a self-generated Bouncing MNIST dataset with sentence-based annotations
  • The gensim pre-trained fastText Wikipedia word embeddings are used to embed tokens as vectors
  • Non-attention-based models are used initially to generate the starting frames
  • The GAN tree trains to look for discriminative features (unverified)
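
To make the self-generated Bouncing MNIST data more concrete, here is a rough sketch of how one bouncing sequence could be produced. It is not the repository's generator (the README points to Sync-DRAW for that): the function name `bouncing_sequence`, the 64x64 frame size, the velocity and the all-ones patch standing in for an MNIST digit are all assumptions for illustration.

```python
import numpy as np

def bouncing_sequence(patch, num_frames=10, frame_size=64,
                      velocity=(3, 2), start=(0, 0)):
    """Move `patch` across a square frame, reflecting off the borders."""
    h, w = patch.shape
    y, x = start
    vy, vx = velocity
    frames = np.zeros((num_frames, frame_size, frame_size), dtype=patch.dtype)
    for t in range(num_frames):
        frames[t, y:y + h, x:x + w] = patch
        ny, nx = y + vy, x + vx
        # Reverse direction when the next step would leave the frame.
        if ny < 0 or ny + h > frame_size:
            vy = -vy
            ny = y + vy
        if nx < 0 or nx + w > frame_size:
            vx = -vx
            nx = x + vx
        # Clamp as a safeguard for large velocities.
        y = min(max(ny, 0), frame_size - h)
        x = min(max(nx, 0), frame_size - w)
    return frames

# A 28x28 block of ones stands in for a real MNIST digit.
digit = np.ones((28, 28), dtype=np.float32)
video = bouncing_sequence(digit, num_frames=20)
print(video.shape)
```

In an actual dataset, each sequence would be paired with a sentence describing the digit and its motion; how those annotations are worded is not specified here.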

Datasets

  1. UCF101 : 3 channel image
  2. Bouncing MNIST

Documentation

  1. We use Sync-DRAW to develop our datasets (https://github.com/syncdraw/Sync-DRAW)
  2. UCF101 is available from the University of Central Florida
  3. We use multi-GPU training (or a single K80 or Titan X)
  4. Cluster training is not supported for now

Results will not be updated here since there might be related publications.