Learning the Pareto front with hypernetworks

A Navon, A Shamsian, G Chechik, E Fetaya - arXiv preprint arXiv …, 2020 - arxiv.org
… PFL implemented using HyperNetworks, which we term Pareto HyperNetworks (PHNs).
PHN learns the entire Pareto front simultaneously using a single hypernetwork, which receives …
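
The snippet cuts off before naming the hypernetwork's input, which in the paper is a preference vector over the objectives. A minimal PyTorch sketch of that conditioning pattern follows; the layer sizes and names are illustrative, not taken from the paper:

```python
import torch
import torch.nn as nn

class ParetoHyperNet(nn.Module):
    """Sketch: map a preference vector over m objectives to the flat
    weights of a small target network (all sizes are illustrative)."""
    def __init__(self, n_objectives=2, target_in=10, target_hidden=32, target_out=1):
        super().__init__()
        self.shapes = [(target_hidden, target_in), (target_hidden,),
                       (target_out, target_hidden), (target_out,)]
        n_params = sum(torch.Size(s).numel() for s in self.shapes)
        self.body = nn.Sequential(
            nn.Linear(n_objectives, 64), nn.ReLU(), nn.Linear(64, n_params))

    def forward(self, preference, x):
        flat = self.body(preference)          # one weight vector per preference
        params, i = [], 0
        for shape in self.shapes:
            n = torch.Size(shape).numel()
            params.append(flat[i:i + n].view(shape))
            i += n
        w1, b1, w2, b2 = params
        h = torch.relu(x @ w1.t() + b1)       # run the generated target net
        return h @ w2.t() + b2

phn = ParetoHyperNet()
pref = torch.tensor([0.3, 0.7])               # a point on the preference simplex
y = phn(pref, torch.randn(4, 10))             # different pref -> different weights
```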

Continual learning with hypernetworks

J Von Oswald, C Henning, BF Grewe… - arXiv preprint arXiv …, 2019 - arxiv.org
… based on task-conditioned hypernetworks, i.e., networks that … data, task-conditioned
hypernetworks only require rehearsing task-… that task-conditioned hypernetworks display a very …
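
A rough sketch of the task-conditioned pattern the snippet describes: only a small learned embedding per task is stored, and an L2 penalty on the hypernetwork's output for old tasks stands in for rehearsing their data. All names and sizes below are hypothetical:

```python
import torch
import torch.nn as nn

class TaskConditionedHyperNet(nn.Module):
    """Sketch: one learned embedding per task conditions a shared
    hypernetwork; per-task target weights are never stored."""
    def __init__(self, n_tasks=5, emb_dim=8, n_target_params=330):
        super().__init__()
        self.task_emb = nn.Embedding(n_tasks, emb_dim)   # cheap per-task state
        self.hnet = nn.Sequential(
            nn.Linear(emb_dim, 64), nn.ReLU(), nn.Linear(64, n_target_params))

    def forward(self, task_id):
        return self.hnet(self.task_emb(task_id))         # weights for that task

hnet = TaskConditionedHyperNet()
w_old = hnet(torch.tensor(0)).detach()      # snapshot of task 0's generated weights
w_new = hnet(torch.tensor(0))
regularizer = ((w_new - w_old) ** 2).mean() # keep old tasks' weights from drifting
```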

Neural architecture search with reinforcement learning

B Zoph, QV Le - arXiv preprint arXiv:1611.01578, 2016 - arxiv.org
Neural networks are powerful and flexible models that work well for many difficult learning
tasks in image, speech and natural language understanding. Despite their success, neural …

Graph hypernetworks for neural architecture search

C Zhang, M Ren, R Urtasun - arXiv preprint arXiv:1810.05749, 2018 - arxiv.org
… We propose Graph HyperNetwork that predicts the parameters of unseen neural networks
by directly operating on their computational graph representations. 2. Our approach achieves …
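
As a loose illustration of that idea, the toy module below runs two message-passing steps over a candidate network's adjacency matrix and decodes each node embedding into that node's parameters. The propagation scheme and sizes are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class GraphHyperNetSketch(nn.Module):
    """Sketch: propagate node features along a computational graph
    (adjacency A), then decode each node embedding into that node's
    weight tensor. The 2-step propagation and sizes are illustrative."""
    def __init__(self, node_feat=16, hidden=32, params_per_node=64):
        super().__init__()
        self.msg = nn.Linear(node_feat, hidden)
        self.upd = nn.GRUCell(hidden, node_feat)
        self.decode = nn.Linear(node_feat, params_per_node)

    def forward(self, node_ops, adjacency):
        h = node_ops                                   # one feature row per op
        for _ in range(2):                             # a couple of message rounds
            m = adjacency @ torch.relu(self.msg(h))    # aggregate from neighbors
            h = self.upd(m, h)
        return self.decode(h)                          # per-node generated weights

n = 5                                                  # ops in the candidate graph
gh = GraphHyperNetSketch()
weights = gh(torch.randn(n, 16), torch.eye(n))         # (n, params_per_node)
```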

Neuroevolution of self-interpretable agents

Y Tang, D Nguyen, D Ha - Proceedings of the 2020 genetic and …, 2020 - dl.acm.org
Hypernetworks [36] suggested making the phenotype directly dependent on the inputs, thus
tailoring the weights of the phenotype to the specific inputs of the network. By incorporating …
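
A minimal sketch of such input-conditioned weight generation, with illustrative dimensions:

```python
import torch
import torch.nn as nn

class InputConditionedLayer(nn.Module):
    """Sketch: the layer's weight matrix is generated from the current
    input, so the 'phenotype' changes with what the agent observes."""
    def __init__(self, d_in=8, d_out=4):
        super().__init__()
        self.gen = nn.Linear(d_in, d_in * d_out)   # hypernetwork head
        self.d_in, self.d_out = d_in, d_out

    def forward(self, x):                          # x: (batch, d_in)
        w = self.gen(x).view(-1, self.d_out, self.d_in)
        return torch.einsum('boi,bi->bo', w, x)    # per-example weights

layer = InputConditionedLayer()
y = layer(torch.randn(3, 8))                       # (3, 4)
```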

Efficient neural architecture search via parameters sharing

H Pham, M Guan, B Zoph, Q Le… - … conference on machine …, 2018 - proceedings.mlr.press
… work (Ha et al., 2017) to generate its weight. Such usage of the hypernetwork in SMASH …
This is because the hypernetwork generates weights for SMASH’s child models via tensor …

Simple and efficient architecture search for convolutional neural networks

T Elsken, JH Metzen, F Hutter - arXiv preprint arXiv:1711.04528, 2017 - arxiv.org
… (2017) used hypernetworks (Ha et al., 2017) to generate the weights for a randomly sampled
network architecture with the goal of eliminating the costly process of training a vast amount …
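
A rough sketch of that idea, ranking randomly sampled architectures by the loss of hypernetwork-generated weights instead of training each one. The flat-vector generator below is a hypothetical stand-in for SMASH's actual scheme:

```python
import torch
import torch.nn as nn

# Hypothetical: map a 6-dim architecture encoding to 352 flat weights
hypernet = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, 352))

def proxy_score(arch_encoding, val_x, val_y):
    """Score a sampled architecture by the loss of its *generated*
    weights, skipping per-architecture training (sizes illustrative)."""
    flat = hypernet(arch_encoding)
    w = flat[:320].view(32, 10)
    b = flat[320:]
    pred = val_x @ w.t() + b            # run the generated one-layer model
    return nn.functional.mse_loss(pred, val_y)

archs = [torch.rand(6) for _ in range(8)]          # random architecture codes
val_x, val_y = torch.randn(16, 10), torch.randn(16, 32)
best = min(archs, key=lambda a: proxy_score(a, val_x, val_y).item())
```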

Zoneout: Regularizing RNNs by randomly preserving hidden activations

D Krueger, T Maharaj, J Kramár, M Pezeshki… - arXiv preprint arXiv …, 2016 - arxiv.org
We propose zoneout, a novel method for regularizing RNNs. At each timestep, zoneout
stochastically forces some hidden units to maintain their previous values. Like dropout, zoneout …
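
The snippet states the rule completely, so a single-timestep sketch is direct (the zoneout probability is illustrative):

```python
import torch

def zoneout_step(h_prev, h_new, p=0.15, training=True):
    """Zoneout per the snippet: each hidden unit keeps its previous
    value with probability p, otherwise takes the new value.
    At test time, use the expectation, as in dropout."""
    if training:
        keep = torch.bernoulli(torch.full_like(h_prev, p))
        return keep * h_prev + (1 - keep) * h_new
    return p * h_prev + (1 - p) * h_new

h_prev, h_new = torch.zeros(2, 5), torch.randn(2, 5)
h = zoneout_step(h_prev, h_new)
```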

Multiplicative interactions and where to find them

SM Jayakumar, WM Czarnecki, J Menick, J Schwarz… - 2020 - openreview.net
… “projected” context by the hypernetwork; or (c) the … hypernetwork that generates a weight
matrix for a matrix multiplication. Similarly, a diagonal 3D tensor is equivalent to a hypernetwork …
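
That equivalence is easy to verify numerically: contracting the context vector into a 3D tensor yields a context-generated weight matrix, and a diagonal tensor degenerates into elementwise gating. A small sketch with illustrative shapes:

```python
import torch

b, dz, dx, dy = 2, 3, 4, 4
W = torch.randn(dz, dy, dx)                   # full 3D interaction tensor
z, x = torch.randn(b, dz), torch.randn(b, dx)

# Bilinear view: y_k = sum_{i,j} W[i,k,j] z_i x_j
y_bilinear = torch.einsum('ikj,bi,bj->bk', W, z, x)

# Hypernetwork view: z generates a weight matrix W(z), applied to x
Wz = torch.einsum('ikj,bi->bkj', W, z)
y_hyper = torch.einsum('bkj,bj->bk', Wz, x)
assert torch.allclose(y_bilinear, y_hyper, atol=1e-5)

# Diagonal 3D tensor: W(z) is diagonal, i.e. a Hadamard gating of x
D = torch.randn(dz, dx)                       # one diagonal per context unit
y_gated = (z @ D) * x                         # generated diagonal times input
```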

Hierarchical multiscale recurrent neural networks

J Chung, S Ahn, Y Bengio - arXiv preprint arXiv:1609.01704, 2016 - arxiv.org
… JC would also like to thank Guillaume Alain, Kyle Kastner and David Ha for providing us …