Learning the Pareto front with hypernetworks
… PFL implemented using HyperNetworks, which we term Pareto HyperNetworks (PHNs).
PHN learns the entire Pareto front simultaneously using a single hypernetwork, which receives …
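As a rough illustration of the mechanism this snippet describes, here is a minimal sketch of a Pareto HyperNetwork: a single hypernetwork maps a sampled preference vector to all weights of a small target network, trained here with linear scalarization (one of several possible training schemes). All layer sizes and names are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class PHN(nn.Module):
    def __init__(self, n_objectives=2, in_dim=10, hidden=32, out_dim=1):
        super().__init__()
        # Shapes of the target network's parameters (a tiny two-layer MLP).
        self.shapes = [(hidden, in_dim), (hidden,), (out_dim, hidden), (out_dim,)]
        n_params = sum(torch.Size(s).numel() for s in self.shapes)
        # The hypernetwork: preference vector -> all target-network weights.
        self.hyper = nn.Sequential(
            nn.Linear(n_objectives, 64), nn.ReLU(), nn.Linear(64, n_params)
        )

    def forward(self, pref, x):
        flat = self.hyper(pref)
        params, i = [], 0
        for s in self.shapes:
            n = torch.Size(s).numel()
            params.append(flat[i:i + n].view(s))
            i += n
        w1, b1, w2, b2 = params
        h = torch.relu(x @ w1.t() + b1)
        return h @ w2.t() + b2

phn = PHN()
opt = torch.optim.Adam(phn.parameters(), lr=1e-3)
x = torch.randn(16, 10)
y1, y2 = torch.randn(16, 1), torch.randn(16, 1)          # two toy objectives
pref = torch.distributions.Dirichlet(torch.ones(2)).sample()  # random preference ray
out = phn(pref, x)
loss = (pref[0] * nn.functional.mse_loss(out, y1)
        + pref[1] * nn.functional.mse_loss(out, y2))     # linear scalarization
loss.backward()
opt.step()
```

At test time, sweeping `pref` across the simplex traces out the learned Pareto front without retraining.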
Continual learning with hypernetworks
… based on task-conditioned hypernetworks, i.e., networks that … data, task-conditioned
hypernetworks only require rehearsing task-… that task-conditioned hypernetworks display a very …
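A minimal sketch of the task-conditioned idea, assuming a learned embedding per task and a simple output regularizer that penalizes drift of the weights generated for earlier tasks; all sizes and names are illustrative.

```python
import torch
import torch.nn as nn

emb_dim, n_tasks, n_params = 8, 3, 2000
task_embs = nn.Embedding(n_tasks, emb_dim)           # one learned vector per task
hyper = nn.Sequential(nn.Linear(emb_dim, 128), nn.ReLU(),
                      nn.Linear(128, n_params))      # embedding -> target weights

def hypernet_regularizer(old_outputs, beta=0.01):
    """Penalize drift of the weights generated for previously seen tasks."""
    reg = 0.0
    for t, frozen in old_outputs.items():
        current = hyper(task_embs(torch.tensor(t)))
        reg = reg + (current - frozen).pow(2).sum()
    return beta * reg

# Before training task t, snapshot the weights generated for tasks 0..t-1;
# only these snapshots (not the old task data) need to be "rehearsed".
old_outputs = {t: hyper(task_embs(torch.tensor(t))).detach() for t in range(2)}
```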
Neural architecture search with reinforcement learning
Neural networks are powerful and flexible models that work well for many difficult learning
tasks in image, speech and natural language understanding. Despite their success, neural …
Graph hypernetworks for neural architecture search
… We propose the Graph HyperNetwork, which predicts the parameters of unseen neural networks
by directly operating on their computational graph representations. Our approach achieves …
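A rough sketch of the graph-hypernetwork mechanism, not the paper's architecture: a few rounds of message passing over a candidate architecture's computational graph, followed by a per-node readout that emits each node's (operation's) weights. All sizes are illustrative.

```python
import torch
import torch.nn as nn

class GraphHyperNet(nn.Module):
    def __init__(self, node_feat=16, hidden=32, params_per_node=64, steps=3):
        super().__init__()
        self.steps = steps
        self.msg = nn.Linear(node_feat, node_feat)
        self.upd = nn.GRUCell(node_feat, node_feat)
        self.readout = nn.Sequential(nn.Linear(node_feat, hidden), nn.ReLU(),
                                     nn.Linear(hidden, params_per_node))

    def forward(self, node_feats, adj):
        h = node_feats                        # (n_nodes, node_feat)
        for _ in range(self.steps):
            m = adj @ self.msg(h)             # aggregate messages from neighbors
            h = self.upd(m, h)                # update node states
        return self.readout(h)                # (n_nodes, params_per_node)

ghn = GraphHyperNet()
adj = torch.tensor([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])  # 3-op chain
weights = ghn(torch.randn(3, 16), adj)        # predicted weights per operation
```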
Neuroevolution of self-interpretable agents
… Hypernetworks [36] suggested making the phenotype directly dependent on the inputs, thus
tailoring the weights of the phenotype to the specific inputs of the network. By incorporating …
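An illustrative sketch of input-conditioned weights as described in the snippet: a small hypernetwork maps the current observation to the weight matrix of a one-layer controller, so the phenotype's weights are tailored to each input. Dimensions are made up.

```python
import torch
import torch.nn as nn

obs_dim, act_dim = 12, 4

class DynamicController(nn.Module):
    def __init__(self):
        super().__init__()
        # Hypernetwork: observation -> flattened controller weight matrix.
        self.hyper = nn.Linear(obs_dim, act_dim * obs_dim)

    def forward(self, obs):
        W = self.hyper(obs).view(act_dim, obs_dim)  # weights depend on the input
        return torch.tanh(W @ obs)

ctrl = DynamicController()
action = ctrl(torch.randn(obs_dim))
```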
Efficient neural architecture search via parameters sharing
… work (Ha et al., 2017) to generate its weights. Such usage of the hypernetwork in SMASH …
This is because the hypernetwork generates weights for SMASH’s child models via tensor …
Simple and efficient architecture search for convolutional neural networks
… (2017) used hypernetworks (Ha et al., 2017) to generate the weights for a randomly sampled
network architecture with the goal of eliminating the costly process of training a vast amount …
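A toy stand-in for this one-shot scheme, assuming a binary encoding of which hidden units are active: a hypernetwork emits weights for each sampled encoding, and candidates are ranked by validation loss without any per-architecture training. Everything here is illustrative.

```python
import torch
import torch.nn as nn

in_dim, hidden, out_dim = 10, 8, 1
n_params = hidden * in_dim + out_dim * hidden
hyper = nn.Linear(hidden, n_params)           # architecture code -> weights

def eval_arch(code, x, y):
    flat = hyper(code)
    w1 = flat[:hidden * in_dim].view(hidden, in_dim)
    w2 = flat[hidden * in_dim:].view(out_dim, hidden)
    h = torch.relu(x @ w1.t()) * code         # mask out inactive hidden units
    return nn.functional.mse_loss(h @ w2.t(), y)

x, y = torch.randn(32, in_dim), torch.randn(32, out_dim)
codes = [(torch.rand(hidden) > 0.5).float() for _ in range(5)]  # random samples
best = min(codes, key=lambda c: eval_arch(c, x, y).item())      # rank by val loss
```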
Zoneout: Regularizing RNNs by randomly preserving hidden activations
We propose zoneout, a novel method for regularizing RNNs. At each timestep, zoneout
stochastically forces some hidden units to maintain their previous values. Like dropout, zoneout …
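The per-timestep rule the snippet describes is compact enough to sketch directly: each hidden unit either keeps its previous value (with probability p) or takes the newly computed one; at test time the expectation of the stochastic mix is used.

```python
import torch

def zoneout(h_prev, h_new, p=0.15, training=True):
    """Mix previous and new hidden states; applied at every RNN timestep."""
    if not training:
        return p * h_prev + (1 - p) * h_new           # expected value at test time
    keep = (torch.rand_like(h_new) < p).float()       # units that "zone out"
    return keep * h_prev + (1 - keep) * h_new
```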
Multiplicative interactions and where to find them
… “projected” context by the hypernetwork; or (c) the … hypernetwork that generates a weight
matrix for a matrix multiplication. Similarly, a diagonal 3D tensor is equivalent to a hypernetwork …
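A small numeric check of the stated equivalence: a bilinear form with a diagonal 3D tensor is the same as a hypernetwork that emits a (diagonal) weight matrix from the context, i.e. elementwise gating. Shapes here are arbitrary.

```python
import torch

d = 5
x, z = torch.randn(d), torch.randn(d)   # input and context
a = torch.randn(d)                      # diagonal entries of the 3D tensor

W = torch.diag(a * z)                   # "hypernetwork": context z -> weight matrix
out_hyper = W @ x                       # matrix multiply with the generated W
out_gate = a * z * x                    # direct multiplicative interaction
assert torch.allclose(out_hyper, out_gate)
```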
Hierarchical multiscale recurrent neural networks
… JC would also like to thank Guillaume Alain, Kyle Kastner and David Ha for providing us …