DeepMind’s AlphaCode is Now Available on GitHub
Google-backed DeepMind recently announced the launch of its code generation model AlphaCode on GitHub, where it has made the dataset and code available.
The release also includes extensive tests to ensure that programs which pass them are correct, a critical feature that current datasets lack.
Earlier this year, AlphaCode made waves with its potential to beat human programmers by analysing algorithmic problems and generating complex programs.
Simplifying computer programming
The developers at DeepMind assessed AlphaCode on competitive programming websites, where human developers are given programming problems and ranked based on their results.
One of them was Codeforces, a popular platform for hosting coding competitions. A selection of 10 varied test problems from different stages of development was given to AlphaCode.
The AI tool achieved an estimated rank within the top 54% of participants in the contest, suggesting that AlphaCode’s code generation system performs at a competitive level.
AlphaCode vs Codex
AlphaCode is a transformer-based language model with 41.4 billion parameters. That makes it roughly three and a half times the size of Codex, the language model behind GitHub Copilot, which has 12 billion parameters. The architecture of AlphaCode is based on three parts:
- Data: The AI tool is fed data from public GitHub repositories.
- Learning: The tool is then trained on this data and fine-tuned to the task’s requirements (e.g., competitive programming at Codeforces).
- Sampling and evaluation: Here, the AI tool performs large-scale sampling of program variations for each problem. Then, through filtering and clustering, the programs are narrowed down to a small subset of 10 solutions that are submitted for external assessment.
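The sampling-and-evaluation step can be illustrated with a minimal sketch. This is not DeepMind's implementation: candidate programs are modelled here as plain Python callables, and every name (`passes_examples`, `behaviour_signature`, `select_submissions`, `probe_inputs`) is hypothetical. It assumes that filtering means discarding candidates that fail the example tests, and that clustering means grouping the survivors by the outputs they produce on extra probe inputs, with one representative taken from each of the largest groups.

```python
from collections import defaultdict
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a generated program: one integer in, one integer out.
Program = Callable[[int], int]


def passes_examples(program: Program, examples: Sequence[Tuple[int, int]]) -> bool:
    """Filter step: keep a program only if it reproduces every example output."""
    try:
        return all(program(x) == expected for x, expected in examples)
    except Exception:
        return False  # a crashing candidate is treated as a failure


def behaviour_signature(program: Program, probe_inputs: Sequence[int]) -> tuple:
    """Record the program's outputs on the probe inputs (assumed clustering key)."""
    outputs = []
    for x in probe_inputs:
        try:
            outputs.append(program(x))
        except Exception:
            outputs.append("error")
    return tuple(outputs)


def select_submissions(
    candidates: Sequence[Program],
    examples: Sequence[Tuple[int, int]],
    probe_inputs: Sequence[int],
    limit: int = 10,
) -> List[Program]:
    """Filter candidates, cluster survivors by behaviour, pick one per cluster."""
    survivors = [p for p in candidates if passes_examples(p, examples)]
    clusters = defaultdict(list)
    for p in survivors:
        clusters[behaviour_signature(p, probe_inputs)].append(p)
    # Largest clusters first: behaviour shared by many samples is tried earlier.
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [cluster[0] for cluster in ranked[:limit]]


# Toy problem: "double the input". The candidates play the role of sampled programs.
candidates = [
    lambda x: x * 2,   # correct
    lambda x: x + x,   # correct, same behaviour as above
    lambda x: x << 1,  # correct, same behaviour as above
    lambda x: x ** 2,  # passes the examples by coincidence, wrong in general
    lambda x: x + 1,   # fails the examples
    lambda x: 8 // x,  # crashes on input 0
]
examples = [(2, 4), (0, 0)]
probe_inputs = [1, 3, 5]

picked = select_submissions(candidates, examples, probe_inputs)
print(len(picked), [p(7) for p in picked])
```

In this toy run, two candidates are removed by the filter and the remaining four fall into two behaviour clusters, so only two programs would be submitted, with the representative of the larger cluster first.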
AlphaCode is pre-trained on code in various programming languages, including C++, C#, Go, Java, JavaScript, Lua, PHP, TypeScript, Ruby, Scala, Rust and Python. The pre-training dataset consists of approximately 715 GB of code along with descriptions.
With AlphaCode, DeepMind has been able to fill a gap in AI models like Codex: problem-solving skills. AlphaCode has been trained not only to “understand” natural language but also to design complex programs and algorithms and implement them in code.
AI expert Alberto Romero said in an article that the company created AlphaCode models in five sizes: 300M, 1B, 3B, 9B and 41B parameters. All of these are named AlphaCode, but the one the organisation refers to in its communications is an ensemble of the 9B and 41B models combined with clustering.
He further said that they built models of different sizes to compare the effects of scale, training times and compute efficiency, among other factors. He added that the model tends to program better in Python than in C++ and generates a similar amount of dead code to humans.
The post DeepMind’s AlphaCode is Now Available on GitHub appeared first on Analytics India Magazine.




