Dopamine and temporal difference learning: A fruitful relationship between neuroscience and AI

Learning and motivation are driven by internal and external rewards. Many of our day-to-day behaviours are guided by predicting, or anticipating, whether a given action will result in a positive (that is, rewarding) outcome. The study of how organisms learn from experience to correctly anticipate rewards has been a productive research field for well over a century, since Ivan Pavlov’s seminal psychological work. In his most famous experiment, dogs were trained to expect food some time after a buzzer sounded. These dogs began salivating as soon as they heard the sound, before the food had arrived, indicating they’d learned to predict the reward. In the original experiment, Pavlov estimated the dogs’ anticipation by measuring the volume of saliva they produced. But in recent decades, scientists have begun to decipher the inner workings of how the brain learns these expectations. Meanwhile, in close contact with this study of reward learning in animals, computer scientists have developed algorithms for reinforcement learning in artificial systems. These algorithms enable AI systems to learn complex strategies without external instruction, guided instead by reward predictions.

Our new work, published in Nature, finds that a recent development in computer science, one that yields significant performance improvements on reinforcement learning problems, may provide a deep, parsimonious explanation for several previously unexplained features of reward learning in the brain. It also opens up new avenues of research into the brain's dopamine system, with potential implications for learning and motivation disorders.
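
For readers unfamiliar with the underlying algorithm, the sketch below illustrates classic (non-distributional) temporal difference learning, whose prediction error is the quantity commonly linked to dopamine responses. The states, reward values, and learning rate are illustrative choices for a Pavlov-style "buzzer then food" episode, not numbers from the paper.

    # Minimal sketch of TD(0) value learning (classic, non-distributional form).
    # States, rewards, and hyperparameters are illustrative, not from the paper.

    alpha = 0.1   # learning rate
    gamma = 0.9   # discount factor

    # Value estimates for a tiny "buzzer -> food" sequence, initialised to zero.
    V = {"buzzer": 0.0, "food": 0.0, "end": 0.0}

    # One episode: buzzer (no reward) -> food (reward 1.0) -> end of trial.
    episode = [("buzzer", 0.0, "food"), ("food", 1.0, "end")]

    for _ in range(100):                      # repeat the experience many times
        for state, reward, next_state in episode:
            # Prediction error (delta): the signal analogous to a dopamine response.
            delta = reward + gamma * V[next_state] - V[state]
            V[state] += alpha * delta         # nudge the value estimate toward the target

    print(V)  # V["buzzer"] approaches gamma * 1.0: the buzzer comes to predict the reward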

LIVE DEBATE – IBM Project Debater

At Intelligence Squared U.S., we’ve debated AI before – the risks, the rewards, and whether it can change the world – but for the first time, we’re debating with AI.

In partnership with IBM, Intelligence Squared U.S. is hosting a unique debate between a world-class champion debater and an AI system. IBM Project Debater is the first AI system designed to debate humans on complex topics using a combination of pioneering research developed by IBM researchers, including: data-driven speechwriting and delivery, listening comprehension, and modeling human dilemmas.

First demonstrated at a small closed-door event in June 2018, Project Debater will now face its toughest opponent yet in front of its largest-ever audience, with our own John Donvan in the moderator’s seat. The topic will not be revealed to Project Debater and the champion human debater until shortly before the debate begins.

Building the Software 2.0 Stack (Andrej Karpathy)

A lot of our code is in the process of being transitioned from Software 1.0 (code written by humans) to Software 2.0 (code written by an optimization process, most commonly neural network training). In the new paradigm, much of a developer’s attention shifts from designing an explicit algorithm to curating large, varied, and clean datasets, which indirectly shape the code. I will provide a number of examples of this ongoing transition, cover the advantages and challenges of the new stack, and outline multiple opportunities for new tooling.
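
As a toy illustration of that shift, the sketch below contrasts a hand-written decision rule (Software 1.0) with the same behaviour obtained by fitting a tiny model to labelled examples (Software 2.0). The spam example, threshold, data, and model are invented for illustration and are not taken from the talk.

    import math

    # Software 1.0: a human writes the decision rule explicitly.
    def is_spam_v1(num_links: int) -> bool:
        return num_links > 3          # threshold chosen by a developer

    # Software 2.0: the "code" (here, two parameters) is produced by optimisation
    # over a curated dataset; the developer's effort shifts to collecting good data.
    data = [(0, 0), (1, 0), (2, 0), (4, 1), (6, 1), (8, 1)]   # (num_links, label)

    w, b = 0.0, 0.0                   # parameters to be learned
    lr = 0.1
    for _ in range(2000):             # plain logistic regression via gradient descent
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))   # sigmoid prediction
            w -= lr * (p - y) * x
            b -= lr * (p - y)

    def is_spam_v2(num_links: int) -> bool:
        return (w * num_links + b) > 0.0

    print(is_spam_v1(5), is_spam_v2(5))   # both flag a link-heavy message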

Tesla Autopilot – Update Timeline

Tesla organizes its software updates in a format similar to semantic versioning. In the version “2020.12.11.1”, 2020 is the year and 12 is the week in which development of that update began, 11 is the major update number, and 1 marks a minor update or maintenance (bug-fix) release.
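
A small sketch of how such a version string could be unpacked, assuming the four-part year.week.major.minor layout described above; the type and field names are my own, not Tesla’s.

    from typing import NamedTuple

    class TeslaVersion(NamedTuple):
        year: int    # calendar year in which the update's development began
        week: int    # week of that year
        major: int   # major update number
        minor: int   # minor / maintenance (bug-fix) number

    def parse_version(version: str) -> TeslaVersion:
        # Assumes the four-part "year.week.major.minor" layout, e.g. "2020.12.11.1".
        year, week, major, minor = (int(part) for part in version.split("."))
        return TeslaVersion(year, week, major, minor)

    print(parse_version("2020.12.11.1"))
    # TeslaVersion(year=2020, week=12, major=11, minor=1)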

Collective Knowledge

Developing novel applications based on deep tech (ML, AI, HPC, quantum, IoT) and deploying them in production is a painful, ad-hoc, time-consuming and expensive process, because the underlying software, hardware, models, data sets and research techniques evolve continuously.

After struggling with these problems for many years, we started the Collective Knowledge project (CK) to decompose complex systems and research projects into reusable, portable, customizable and non-virtualized CK components with unified automation actions, Python APIs, a CLI and JSON meta descriptions.

Our idea is to gradually abstract all existing artifacts (software, hardware, models, data sets, results) and use the DevOps methodology to connect such components together into functional CK solutions. Such solutions can automatically adapt to evolving models, data sets and bare-metal platforms with the help of customizable program workflows, a list of all dependencies (models, data sets, frameworks), and a portable meta package manager.

CK is basically our intermediate language for connecting researchers and practitioners so they can collaboratively design, benchmark, optimize and validate innovative computational systems. CK then makes it possible to find the most efficient system configurations on a Pareto frontier (trading off speed, accuracy, energy, size and other costs) using an open repository of knowledge with live SOTA scoreboards and reproducible papers.
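
To make the component idea concrete, here is a hypothetical sketch of what a CK-style component might look like: a JSON meta description that declares dependencies, paired with a Python automation action. The file layout, field names, and function are invented for illustration and do not reproduce CK’s actual API.

    # Hypothetical sketch of a CK-style component: a JSON meta description paired
    # with a Python automation action. Names and fields are illustrative only.
    import json

    # Meta description of an imaginary "image-classification" component: declared
    # dependencies let a workflow adapt to whatever model or data set is available.
    META = json.loads("""
    {
      "name": "image-classification",
      "dependencies": {"model": "resnet-like", "dataset": "imagenet-subset"},
      "actions": ["benchmark"]
    }
    """)

    def benchmark(meta: dict, platform: str) -> dict:
        # A unified automation action: resolve declared dependencies for the
        # target platform and return results as JSON-friendly data.
        return {
            "component": meta["name"],
            "platform": platform,
            "resolved": meta["dependencies"],
            "latency_ms": 42.0,        # placeholder measurement
        }

    if __name__ == "__main__":
        print(json.dumps(benchmark(META, platform="x86_64-linux"), indent=2))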

The Singularity Is Near: When Humans Transcend Biology

For over three decades, Ray Kurzweil has been one of the most respected and provocative advocates of the role of technology in our future. In his classic The Age of Spiritual Machines, he argued that computers would soon rival the full range of human intelligence at its best. Now he examines the next step in this inexorable evolutionary process: the union of human and machine, in which the knowledge and skills embedded in our brains will be combined with the vastly greater capacity, speed, and knowledge-sharing ability of our creations.