The International Conference on Learning Representations (ICLR) recently announced the winners of the ICLR 2021 Outstanding Paper Awards, recognising eight papers out of the 860 accepted this year.

The papers were evaluated for both technical quality and the potential to create a practical impact.

The selection committee was chaired by Ivan Titov (University of Edinburgh/University of Amsterdam), Naila Murray (Facebook AI Research), and Alice Oh (KAIST), together with Senior Program Chair Katja Hofmann (Microsoft Research).

Here are the outstanding papers at ICLR 2021:

Beyond Fully-Connected Layers with Quaternions: Parameterisation of Hypercomplex Multiplications with 1/n Parameters 

By Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Cheung Hui, and Jie Fu

This paper parameterises hypercomplex multiplications with arbitrarily learnable parameters, allowing models to learn multiplication rules from data regardless of whether such rules are predefined. The method not only subsumes the Hamilton product but also learns to operate on arbitrary hypercomplex spaces, offering greater architectural flexibility while using only 1/n of the parameters of the fully-connected layer counterpart. It can be applied to LSTM and transformer models on natural language inference, machine translation, text style transfer, and more.
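To make the parameter saving concrete, here is a minimal sketch of a parameterised hypercomplex multiplication, assuming the layer's output dimension k and input dimension d are divisible by n; the tensor names A and S are illustrative stand-ins, not the paper's notation.

```python
import numpy as np

def phm_forward(x, A, S):
    """y = W @ x, with W built from a sum of Kronecker products.

    A: (n, n, n)        learned "multiplication rule" components
    S: (n, k//n, d//n)  learned weight components
    Parameter count: n**3 + k*d/n, versus k*d for a dense layer.
    """
    W = sum(np.kron(A[i], S[i]) for i in range(A.shape[0]))   # shape (k, d)
    return W @ x

# toy usage: with n = 4, a suitable A recovers quaternion-like structure
# (the Hamilton product); here A and S are random learnable stand-ins
n, k, d = 4, 8, 12
rng = np.random.default_rng(0)
A = rng.normal(size=(n, n, n))
S = rng.normal(size=(n, k // n, d // n))
y = phm_forward(rng.normal(size=d), A, S)   # shape (k,)
```

Because A is learned rather than fixed, the multiplication rule itself is fitted to the data, which is where the flexibility beyond quaternions comes from.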

Check the paper here.

Complex Query Answering with Neural Link Predictors

By Erik Arakelyan, Daniel Daza, Pasquale Minervini, and Michael Cochez

In this paper, the researchers tackle complex query answering by decomposing each query into sub-queries answered with neural link predictors and casting the task as an optimisation problem. Neural link predictors are used to identify missing edges in large-scale Knowledge Graphs, but they had not been explored for answering more complex queries involving logical conjunctions, disjunctions and existential quantifiers. The researchers proposed a framework for efficiently answering such complex queries on incomplete Knowledge Graphs: each query is translated into an end-to-end differentiable objective in which the truth value of each atom is computed by a pre-trained neural link predictor.
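As a rough illustration of that idea, the sketch below ranks answers to a two-hop conjunctive query using a placeholder link predictor, combining atoms with the product t-norm and approximating the existential quantifier with a max over entities; the function and argument names are hypothetical, not the paper's API.

```python
def answer_query(entities, score, anchor, r1, r2):
    """Rank answers A for the query  ?A : exists V . r1(anchor, V) AND r2(V, A).

    score(h, r, t) is a pre-trained neural link predictor returning a
    truth value in [0, 1]; the product below is the product t-norm for
    conjunction, and the max over V handles the existential quantifier.
    """
    results = {a: max(score(anchor, r1, v) * score(v, r2, a)
                      for v in entities)
               for a in entities}
    return sorted(results, key=results.get, reverse=True)
```

In practice the search over intermediate entities is done more efficiently (for example with beam search or continuous optimisation), but the differentiable-objective structure is the same.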

Check the paper here. 

EigenGame: PCA as a Nash Equilibrium 

By Ian Gemp, Brian McWilliams, Claire Vernade, and Thore Graepel

In this paper, researchers presented a novel view of principal component analysis as a competitive game in which each approximate eigenvector is controlled by a player whose goal is to maximise their own utility function. They developed an algorithm that combines elements of Oja's rule with a generalised Gram-Schmidt orthogonalisation, and demonstrated its scalability with experiments on large image datasets and neural network activations. The researchers also discussed how this view of PCA as a differentiable game can lead to further algorithmic developments and insights.
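A minimal sketch of one round of that game, assuming a symmetric matrix M: each player ascends the gradient of its utility, penalised for alignment with earlier players' directions (the Gram-Schmidt-like ingredient), and is renormalised onto the unit sphere (the Oja-like ingredient).

```python
import numpy as np

def eigengame_step(V, M, lr=0.1):
    """One update per player. V: (k, d), rows are unit-norm vectors."""
    for i in range(V.shape[0]):
        v = V[i]
        grad = 2 * M @ v                                 # reward: v' M v
        for j in range(i):                               # penalties vs. parents
            u = V[j]
            grad -= 2 * ((v @ M @ u) / (u @ M @ u)) * (M @ u)
        grad -= (grad @ v) * v                           # project to sphere tangent
        v = v + lr * grad                                # ascend the utility
        V[i] = v / np.linalg.norm(v)                     # stay on the unit sphere
    return V

# toy usage: recover the top-3 principal directions of a covariance matrix
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
M = X.T @ X / len(X)
V = rng.normal(size=(3, 5))
V /= np.linalg.norm(V, axis=1, keepdims=True)
for _ in range(500):
    V = eigengame_step(V, M)
```

At the Nash equilibrium of this game the players' vectors align with the principal components, which is the paper's central observation.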

Check the paper here.

Learning Mesh-Based Simulation with Graph Networks 

By Tobias Pfaff, Meire Fortunato, Alvaro Sanchez-Gonzalez, and Peter Battaglia

Mesh-based simulations are used to model complex physical systems across science and engineering. They support powerful numerical integration methods and strike favourable trade-offs between accuracy and efficiency, but high-dimensional scientific simulations are very expensive to run. To overcome this challenge, the researchers introduced MeshGraphNets, a framework for learning mesh-based simulations using graph neural networks.
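The sketch below shows one message-passing step over mesh edges in the encode-process-decode spirit of such graph networks; the linear-plus-tanh maps stand in for the learned MLPs and are illustrative only, not the paper's architecture.

```python
import numpy as np

def message_passing_step(node_feats, edges, edge_feats, W_edge, W_node):
    """node_feats: (N, F); edges: list of (sender, receiver) index pairs;
    edge_feats: (E, F); W_edge: (F, 3F); W_node: (F, 2F)."""
    agg = np.zeros_like(node_feats)
    new_edge_feats = np.empty_like(edge_feats)
    for k, (s, r) in enumerate(edges):
        # update each edge from its two endpoint nodes and its own features
        msg = np.tanh(W_edge @ np.concatenate([node_feats[s], node_feats[r],
                                               edge_feats[k]]))
        new_edge_feats[k] = msg
        agg[r] += msg                                    # aggregate at receivers
    # update each node from its own features and the aggregated messages
    new_node_feats = np.tanh(np.concatenate([node_feats, agg], axis=1) @ W_node.T)
    return new_node_feats, new_edge_feats
```

Stacking several such steps and decoding the node features into physical quantities (for example, accelerations) yields a learned simulator that can be rolled out over time.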

Check the paper here. 

Neural Synthesis of Binaural Speech from Mono Audio 

By Alexander Richard, Dejan Markovic, Israel D. Gebru, Steven Krenn, Gladstone Alexander Butler, Fernando Torre, and Yaser Sheikh

In this paper, researchers presented a neural rendering approach for binaural sound synthesis that produces realistic and spatially accurate binaural sound in real time. In a theoretical analysis, they investigated deficiencies of the l2-loss on raw waveforms and introduced an improved loss that overcomes these limitations. It is one of the first approaches to generate spatially accurate waveform outputs, and it outperforms existing approaches by a considerable margin, both quantitatively and in a perceptual study.
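The paper's exact loss is more involved, but a loss that penalises spectral amplitude and phase errors separately, rather than relying on a plain l2 distance between raw waveforms, can be sketched as follows; this is a generic formulation in the spirit of the paper's critique, not its precise objective.

```python
import torch

def amplitude_phase_loss(pred, target, n_fft=512):
    """Separate amplitude and phase penalties on STFT frames."""
    P = torch.stft(pred, n_fft, return_complex=True)
    T = torch.stft(target, n_fft, return_complex=True)
    amp_loss = torch.mean((P.abs() - T.abs()) ** 2)
    # wrap-around-safe phase distance, weighted by target energy so that
    # phase errors in silent regions do not dominate
    phase_loss = torch.mean(T.abs() * (1 - torch.cos(P.angle() - T.angle())))
    return amp_loss + phase_loss
```

Weighting the phase term by amplitude reflects the intuition that phase accuracy matters most where the signal actually carries energy.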

Check the paper here.

Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime 

By Atsushi Nitanda, and Taiji Suzuki

In this paper, researchers analysed the convergence of averaged stochastic gradient descent for overparameterised two-layer neural networks on regression problems. The work builds on the observation that the neural tangent kernel (NTK) plays an important role in establishing the global convergence of gradient-based methods in the NTK regime, but that there is still room for a sharper convergence rate analysis in this regime. The researchers showed that averaged stochastic gradient descent can achieve the minimax optimal convergence rate, with a global convergence guarantee, by exploiting the complexity of the target function.
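Averaged SGD itself is simple to state: run ordinary SGD and return the running average of the iterates (Polyak-Ruppert averaging), which is the estimator the paper analyses. A minimal sketch with a generic stochastic gradient oracle:

```python
import numpy as np

def averaged_sgd(grad_fn, w0, lr, n_steps, rng):
    """Plain SGD plus a running (Polyak-Ruppert) average of the iterates."""
    w = w0.copy()
    w_bar = np.zeros_like(w0)
    for t in range(1, n_steps + 1):
        w = w - lr * grad_fn(w, rng)        # stochastic gradient step
        w_bar += (w - w_bar) / t            # running average of iterates
    return w_bar

# toy usage: noisy gradients of 0.5 * (w - 3)^2
grad = lambda w, rng: (w - 3.0) + rng.normal(scale=0.5, size=w.shape)
w_hat = averaged_sgd(grad, np.zeros(1), lr=0.05, n_steps=2000,
                     rng=np.random.default_rng(0))
```

The averaging suppresses the noise of the individual SGD iterates, which is what makes the optimal rates attainable in the paper's analysis.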

Check the paper here.

Rethinking Architecture Selection in Differentiable NAS

By Ruochen Wang, Minhao Cheng, Xiangning Chen, Xiaocheng Tang, and Cho-Jui Hsieh

In this paper, researchers examined differentiable neural architecture search, one of the most popular Neural Architecture Search (NAS) methods, known for its search efficiency and simplicity. While much attention has gone to optimising the supernet, the architecture selection process has received comparatively little. The researchers provided empirical and theoretical analysis showing that the magnitude of an architecture parameter does not necessarily indicate how much the corresponding operation contributes to the supernet's performance, and proposed an alternative perturbation-based architecture selection that directly measures each operation's influence on the supernet.
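A hedged sketch of that selection rule: instead of reading importance off the architecture parameters, remove each candidate operation in turn and keep the one whose removal degrades supernet validation accuracy the most. The evaluate function is a placeholder for supernet evaluation, not an API from the paper.

```python
def select_operation(supernet, edge, candidate_ops, evaluate):
    """Pick the operation on `edge` whose removal hurts the supernet most."""
    base_acc = evaluate(supernet, mask=None)            # full supernet accuracy
    best_op, best_drop = None, float("-inf")
    for op in candidate_ops:
        acc = evaluate(supernet, mask=(edge, op))       # supernet without this op
        drop = base_acc - acc
        if drop > best_drop:                            # largest drop wins
            best_op, best_drop = op, drop
    return best_op
```

This directly measures each operation's contribution, sidestepping the mismatch between architecture-parameter magnitude and actual importance that the paper identifies.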

Check the paper here.

Score-Based Generative Modeling through Stochastic Differential Equations

By Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole

In this paper, researchers presented a stochastic differential equation (SDE) that smoothly transforms a complex data distribution into a known prior distribution by slowly injecting noise, together with a corresponding reverse-time SDE that transforms the prior distribution back into the data distribution by slowly removing the noise. The work builds on the observation that creating noise from data is easy, whereas creating data from noise is generative modelling. Crucially, the reverse-time SDE depends only on the time-dependent gradient field (the score) of the perturbed data distribution. By leveraging advances in score-based generative modelling, the researchers accurately estimated these scores with neural networks and used numerical SDE solvers to generate samples. They also demonstrated a new way to solve inverse problems with score-based models.
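As a rough illustration, the following Euler-Maruyama sampler integrates the reverse-time SDE of a variance-exploding diffusion, assuming a trained score network score(x, t) that approximates the gradient of the log perturbed-data density; the geometric noise schedule here is a common choice for this SDE family, not necessarily the paper's exact configuration.

```python
import numpy as np

def reverse_sde_sample(score, shape, sigma_min=0.01, sigma_max=10.0,
                       n_steps=500, seed=0):
    """Euler-Maruyama integration of the reverse-time VE SDE."""
    rng = np.random.default_rng(seed)
    x = sigma_max * rng.normal(size=shape)      # sample from the prior
    dt = 1.0 / n_steps
    for i in range(n_steps, 0, -1):
        t = i * dt
        sigma = sigma_min * (sigma_max / sigma_min) ** t
        g2 = sigma**2 * 2 * np.log(sigma_max / sigma_min)   # g(t)^2
        z = rng.normal(size=shape)
        # reverse-time step: drift follows the score, plus diffusion noise
        x = x + g2 * score(x, t) * dt + np.sqrt(g2 * dt) * z
    return x
```

The key point the paper makes is visible here: the sampler needs nothing but the score, so once a network estimates it, generation reduces to numerically solving an SDE.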

Check the paper here.
