World-GAN : a Generative Model for Minecraft Worlds


Motivation

After taking a leave of absence from university, I recently got into playing Minecraft. It reminded me of my childhood, and at the same time I realized how many features have changed and been updated. While enjoying my life in its cubic 3D world, I wondered what it would look like if the Minecraft world were generated by generative models.

Now that I have read about GANs, I searched for generative models related to the game and found an interesting one : World-GAN.

World-GAN

What is World-GAN?

World-GAN is a generative model for generating Minecraft worlds. From a single example, it can perform PCGML (procedural content generation via machine learning) in Minecraft. It uses the block2vec representation, motivated by word2vec and the dense representations used in NLP. Via block2vec, World-GAN is able to generate large-scale worlds based on parts of users' creations.

To understand how World-GAN works, reading the following previous works is also helpful :

1. Sin-GAN (Shaham, Dekel, and Michaeli 2019)

  • a GAN architecture that learns from a single natural image
  • a cascade of fully convolutional generators and discriminators trained on patches at multiple scales

2. TOAD-GAN (M. Awiszus, F. Schubert, and B. Rosenhahn 2020)

  • replaced the bilinear downsampling of Sin-GAN with a special downsampling operation
  • determined the importance of each token using a hierarchy constructed by a heuristic,
    motivated by the TF-IDF metric from NLP (see the sketch after this list)
  • applied to 2D token-based games like Super Mario
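
The hierarchy-based downsampling can be pictured roughly as follows. This is only a sketch of the idea: the token names and importance values are made up, and TOAD-GAN derives the actual importance from its TF-IDF-motivated heuristic.

```python
import numpy as np

def downsample_by_importance(level, importance, factor=2):
    """Downsample a 2D token grid by keeping the most 'important' token
    in each factor x factor patch instead of interpolating."""
    h, w = level.shape
    out = np.empty((h // factor, w // factor), dtype=level.dtype)
    for i in range(0, h - h % factor, factor):
        for j in range(0, w - w % factor, factor):
            patch = level[i:i + factor, j:j + factor].ravel()
            out[i // factor, j // factor] = max(patch, key=lambda t: importance[t])
    return out

# Toy example: 0 = air, 1 = ground, 2 = a rare but important token
importance = {0: 0.1, 1: 0.5, 2: 0.9}
level = np.array([[0, 0, 1, 1],
                  [0, 2, 1, 1],
                  [1, 1, 1, 1],
                  [1, 1, 1, 1]])
print(downsample_by_importance(level, importance))  # the rare token survives downsampling
```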

Problem Scenarios

The main problems with applying TOAD-GAN directly to world generation can be summarized in two parts.

  1. The conversion from 2D to 3D leads to a dramatic increase in the size of the samples (see the rough comparison after this list).
  2. Minecraft has a large variety of tokens with a long-tail distribution.
    • The hierarchy sometimes aliases low-frequency tokens, which are significant and should not be ignored.
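
To get a rough feel for the first problem, here is a back-of-the-envelope comparison (all sizes and token counts below are illustrative assumptions, not numbers from the paper):

```python
# Illustrative only: compare the number of values a one-hot representation
# has to model for a 2D level slice vs. a 3D Minecraft volume.
mario_slice = 16 * 200            # 2D platformer slice: height x width
minecraft_volume = 64 * 64 * 64   # modest 3D volume: depth x height x width

tokens_2d = 30     # rough vocabulary of a 2D token-based game
tokens_3d = 300    # Minecraft has hundreds of block types

print(mario_slice * tokens_2d)       # 96,000 values
print(minecraft_volume * tokens_3d)  # 78,643,200 values, roughly 800x more
```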

block2vec

In order to resolve the above problems, the paper suggests a new token embedding method called block2vec.

Say that there is a token $b_i$ in a given training sample, and let $f(b_i)$ be the frequency of that token. Then the occurrence probability of the token can be written as :

$$P(b_i)=\sqrt{\frac{f(b_i)}{0.001}+1}\times \frac{0.001}{f(b_i)}$$

By sampling the tokens according to $P(b_i)$, block2vec mitigates the issue of token imbalance.
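
A minimal sketch of this subsampling step (word2vec-style), assuming $f(b_i)$ is the relative frequency of a token in the training sample; the helper names are mine, not from the paper:

```python
import random
from collections import Counter

def keep_probability(freq, t=0.001):
    """P(b_i) from above; values > 1 simply mean the token is always kept."""
    return (freq / t + 1) ** 0.5 * (t / freq)

def subsample(tokens, t=0.001):
    """Drop frequent tokens (e.g. air, stone) more often than rare ones."""
    counts = Counter(tokens)
    total = len(tokens)
    freqs = {tok: c / total for tok, c in counts.items()}
    return [tok for tok in tokens if random.random() < keep_probability(freqs[tok], t)]
```

Frequent tokens such as air get a low keep probability, while rare tokens are essentially always kept, so the skip-gram training pairs are far less dominated by the head of the distribution.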

Three advantages of using block2vec

  1. Reduced memory requirements
  2. No need to define a token hierarchy
    • the paper visualizes the token embeddings (reduced to 2 dimensions with the MDE technique)
    • rare tokens are placed close to semantically similar, more common tokens
  3. Choosing a different mapping from internal representations to tokens allows us to change the style of the generated content after training (illustrated below)
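
Advantage 3 can be illustrated with a tiny example: the generator outputs vectors in the embedding space, and decoding them against a different token vocabulary changes the style of the result. The vocabularies and vectors below are invented for illustration; the paper swaps the representation-to-token mapping, but not necessarily in this exact way.

```python
import numpy as np

def decode(embeddings, token_vectors, token_names):
    """Map each generated embedding to its nearest token (cosine similarity)."""
    emb = embeddings / np.linalg.norm(embeddings, axis=-1, keepdims=True)
    vocab = token_vectors / np.linalg.norm(token_vectors, axis=-1, keepdims=True)
    idx = (emb @ vocab.T).argmax(axis=-1)
    return np.array(token_names)[idx]

generated = np.random.randn(2, 2, 2, 3)  # a toy 2x2x2 block of 3-dim embeddings

# Decoding the same output against a "plains" or a "desert" vocabulary
plains = {"grass": [1, 0, 0], "dirt": [0, 1, 0], "oak_log": [0, 0, 1]}
desert = {"sand": [1, 0, 0], "sandstone": [0, 1, 0], "cactus": [0, 0, 1]}

for vocab in (plains, desert):
    names = list(vocab)
    vectors = np.array([vocab[n] for n in names], dtype=float)
    print(decode(generated, vectors, names))
```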

Training

  1. A skip-gram model with two linear layers predicts the context from the target token (see the sketch after this list)
  2. The generator produces an $m \times D \times H \times W$ tensor, where $m$ is the embedding dimension and $D, H, W$ are the spatial dimensions
  3. The tensor is fed to the discriminator
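
A minimal PyTorch sketch of these pieces. The layer sizes, embedding dimension, and architecture details are my assumptions for illustration, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class SkipGram(nn.Module):
    """block2vec: predict context tokens from a target token with two linear maps."""
    def __init__(self, num_tokens, emb_dim=32):
        super().__init__()
        self.target = nn.Embedding(num_tokens, emb_dim)  # lookup = linear layer on one-hot input
        self.context = nn.Linear(emb_dim, num_tokens)    # logits over possible context tokens

    def forward(self, target_ids):
        return self.context(self.target(target_ids))

class Generator3D(nn.Module):
    """Fully convolutional 3D generator producing an m x D x H x W embedding tensor."""
    def __init__(self, emb_dim=32, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(emb_dim, hidden, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(hidden, emb_dim, 3, padding=1),
        )

    def forward(self, noise):    # noise: (batch, m, D, H, W)
        return self.net(noise)   # same shape; this tensor goes to the discriminator

tokens = torch.tensor([3, 7, 7, 1])
print(SkipGram(num_tokens=300)(tokens).shape)             # torch.Size([4, 300])
print(Generator3D()(torch.randn(1, 32, 8, 8, 8)).shape)   # torch.Size([1, 32, 8, 8, 8])
```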

Experiment Results

Qualitative

Quantitative
