Open Access Te Herenga Waka-Victoria University of Wellington

Learning to Disentangle the Complex Causes of Data

thesis
posted on 2021-11-23, 10:45 authored by Butler-Yeoman, Tony

The ability to extract and model the meaning in data has been key to the success of modern machine learning. Typically, data reflects a combination of multiple sources that are mixed together. For example, photographs of people’s faces reflect the subject of the photograph, lighting conditions, angle, and background scene. It is therefore natural to wish to extract these multiple, largely independent, sources, a task known in the literature as disentangling. A further benefit of disentangling is that the resulting representation is simpler: it has fewer free parameters, which reduces the curse of dimensionality and aids learning.

While there has been a lot of research into finding disentangled representations, it remains an open problem. This thesis considers a number of approaches to a particularly difficult version of this task: we wish to disentangle the complex causes of data in an entirely unsupervised setting. That is, given access only to unlabeled, entangled data, we search for algorithms that can identify the generative factors of that data, which we call causes. Further, we assume that causes can themselves be complex and require a high-dimensional representation.

We consider three approaches to this challenge: as an inference problem, as an extension of independent components analysis, and as a learning problem. Each method is motivated, described, and tested on a set of datasets built from entangled combinations of images, most commonly MNIST digits. Where the results fall short of disentangling, the reasons for this are dissected and analysed. The last method that we describe, which is based on combinations of autoencoders that learn to predict each other’s output, shows some promise on this extremely challenging problem.

History

Copyright Date

2017-01-01

Date of Award

2017-01-01

Publisher

Te Herenga Waka—Victoria University of Wellington

Rights License

Author Retains Copyright

Degree Discipline

Computer Science

Degree Grantor

Te Herenga Waka—Victoria University of Wellington

Degree Level

Masters

Degree Name

Master of Science

ANZSRC Type Of Activity code

970108 Expanding Knowledge in the Information and Computing Sciences

Victoria University of Wellington Item Type

Awarded Research Masters Thesis

Language

en_NZ

Victoria University of Wellington School

School of Engineering and Computer Science

Advisors

Frean, Marcus; Marsland, Stephen