New twists in behavioral association theories as worm turns

Physicists have developed a dynamical model of animal behavior that may explain some mysteries surrounding associative learning going back to Pavlov's dogs. The Proceedings of the National Academy of Sciences (PNAS) published the findings, based on experiments on a common laboratory organism, the roundworm C. elegans.

"We showed how learned associations are not mediated by just the strength of an association, but by multiple, nearly independent pathways — at least in the worms," says Ilya Nemenman, an Emory professor of physics and biology whose lab led the theoretical analyses for the paper. "We expect that similar results will hold for larger animals as well, including maybe in humans."

"Our model is dynamical and multi-dimensional," adds William Ryu, an associate professor of physics at the Donnelly Centre at the University of Toronto, whose lab led the experimental work. "It explains why this example of associative learning is not as simple as forming a single positive memory. Instead, it's a continuous interplay between positive and negative associations that are happening at the same time."

First author of the paper is Ahmed Roman, who worked on the project as an Emory graduate student and is now a postdoctoral fellow at the Broad Institute. Konstantine Palanski, a former graduate student at the University of Toronto, is also an author.

The conditioned reflex

More than 100 years ago, Ivan Pavlov discovered the "conditioned reflex" in animals through his experiments on dogs. For example, after a dog was trained to associate a sound with the subsequent arrival of food, the dog would start to salivate when it heard the sound, even before the food appeared.

About 70 years later, psychologists built on Pavlov's insights to develop the Rescorla-Wagner model of classical conditioning. This mathematical model describes a conditioned association by its time-dependent strength. That strength increases when the animal can use the conditioned stimulus (for Pavlov's dogs, the sound) to reduce its surprise at the arrival of the unconditioned stimulus (the food).
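To make the idea of a single time-dependent strength concrete, here is a minimal sketch of a Rescorla-Wagner-style update in Python. It is an illustration rather than anything taken from the paper: the function name, the single `learning_rate` parameter (which stands in for the model's separate salience terms) and the trial counts are all assumptions chosen for readability.

```python
def rescorla_wagner(trials, learning_rate=0.3, initial_strength=0.0):
    """Return the association strength after each trial.

    trials: sequence of outcomes, 1.0 if food follows the sound, 0.0 if not.
    learning_rate and initial_strength are illustrative values, not fitted ones.
    """
    strength = initial_strength
    history = []
    for outcome in trials:
        surprise = outcome - strength          # prediction error on this trial
        strength += learning_rate * surprise   # move the strength toward the outcome
        history.append(strength)
    return history


# Acquisition: the sound is always followed by food.
acquisition = rescorla_wagner([1.0] * 20)

# Extinction: the sound is presented alone, starting from the learned strength.
extinction = rescorla_wagner([0.0] * 20, initial_strength=acquisition[-1])

print(round(acquisition[-1], 3), round(extinction[-1], 3))
```

In this sketch the strength climbs toward 1 while food follows the sound and falls steadily toward 0 once the food stops. A single number of this kind can only rise or decay with each trial.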

Such insights helped set the stage for modern theories of reinforcement learning in animals, which in turn enabled reinforcement learning algorithms in artificial intelligence systems. But many mysteries remain, including some related to Pavlov's original experiments.

After Pavlov trained dogs to associate the sound of a bell with food, he would repeatedly expose them to the bell without food. During the first few trials without food, the dogs continued to salivate when the bell rang. If the trials continued long enough, the dogs "unlearned" and stopped salivating in response to the bell. The association was said to be "extinguished."

Teasing out the puzzle

Pavlov discovered, however, that if he waited a while and then retested the dogs, they would once again salivate in response to the bell, even if no food was present. Neither Pavlov nor more recent associative-learning theories could accurately explain or mathematically model this spontaneous recovery of an extinguished association.

Researchers have explored such mysteries through experiments with C. elegans. The one-millimeter roundworm has only about 1,000 cells, 300 of which are neurons. That simplicity gives scientists a tractable system for testing how the animal learns. At the same time, C. elegans' neural circuitry is just complicated enough to connect some of the insights gained from studying its behavior to more complex systems.

Earlier experiments have established that C. elegans can be trained to prefer a cooler or warmer temperature by conditioning it at a certain temperature with food. In a typical experiment, the worms are placed in a petri dish with a gradient of temperatures but no food. Those trained to prefer a cooler temperature will move to the cooler side of the dish, while the worms trained to prefer a warmer temperature go to the warmer side.

But what exactly do these results mean? Some believe that the worms crawl toward a particular temperature in expectation of food. Others argue that the worms simply become habituated to that temperature, so they prefer to hang out there even without a food reward.

The puzzle could not be resolved due to a major limitation of many of these experiments — the lengthy amount of time it takes for a worm to traverse a nine-centimeter petri dish in search of the preferred temperature.

Measuring how learning changes over time

Nemenman and Ryu sought to overcome this limitation. They wanted to develop a practical way to precisely measure the dynamics of learning, or how learning changes over time.

Ryu's lab used a microfluidic device to shrink the experimental model of nine-centimeter petri dishes into four-millimeter droplets. The researchers could rapidly run experiments on hundreds of worms, each worm encased within its individual droplet.

"We could observe in real time how a worm moved across a linear gradient of temperatures," Ryu says. "Instead of waiting for it to crawl for 30 minutes or an hour, we could much more quickly see which side of the droplet, the cold side or the warm side, that the worm preferred. And we could also follow how its preferences changed with time."

Their experiments confirmed that if a worm is trained to associate food with a cooler temperature, it will move to the cooler side of the droplet. Over time, however, with no food present, this learned preference seemingly decays.

"We found that suddenly the worms wanted to spend more time on the warm side of the droplet," Ryu says. "That's surprising because why would the worms develop a different preference and even avoidance of the temperature they had come to associate with food?"

Eventually the worm begins moving back and forth between the cooler and warmer temperatures.

The researchers hypothesized that the worm does not simply forget the positive memory of food associated with cooler temperatures but instead starts to negatively associate the cooler side with no food. That spurs it to head for the warmer side. Then, as more time passes, it begins to form a negative association of no food with the warmer temperature, which, combined with the residual positive association with the cold, makes it migrate back to the cooler side.

"The worm is always learning, all the time," Ryu explains. "There is an interplay between the drive of a positive association and a negative association that causes it to start oscillating between cold and warm."

'It's like when you lose your keys'

Nemenman's team developed theoretical equations to describe the interactions over time between the two independent variables — the positive, or excitatory, association that drives a worm toward one temperature and the negative, or inhibitory, association that drives it away from that temperature.
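The flavor of such a two-variable description can be shown with a toy sketch in Python. These are not the paper's equations: the functional forms, the parameter values, the arbitrary time units and every function name below are assumptions made purely for illustration. The excitatory association with the trained (cold) temperature is assumed to fade slowly, while the inhibitory association is assumed to build up quickly once food is absent and then fade.

```python
import math


def excitatory(t, e0=1.0, tau_e=10.0):
    """Positive association with the cold side; assumed to decay slowly."""
    return e0 * math.exp(-t / tau_e)


def inhibitory(t, peak=1.5, tau_i=2.0):
    """Negative association with the cold side; assumed to rise from zero,
    reach `peak` at t = tau_i, and then fade."""
    return peak * (t / tau_i) * math.exp(1.0 - t / tau_i)


def net_preference(t):
    """Positive values favor the cold side, negative values the warm side."""
    return excitatory(t) - inhibitory(t)


def preferred_side(t):
    return "cold" if net_preference(t) > 0 else "warm"


# Sample the toy model early, in the middle, and late (arbitrary time units).
for t in (0.0, 2.0, 10.0):
    print(t, preferred_side(t))
```

With these made-up numbers the toy worm favors the cold side at first, switches to the warm side while the inhibitory term dominates, and later drifts back toward the cold. The qualitative point is that two associations evolving on different timescales can reverse a preference in a way a single strength cannot.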

"The side that the worm gravitates toward depends on when exactly you take the measurements," Nemenman explains. "It's like when you lose your keys you may check the desk where you usually keep them first. If you don't see them there right away, you run around different places looking for them. If you still don't find them, you go back to the original desk figuring you just didn't look hard enough."

The researchers repeated the experiments under different conditions. They trained the worms at different starting temperatures and starved them for different durations of time before testing their temperature preference, and the worms' behaviors were correctly predicted by the equations.

They also tested their hypothesis by genetically modifying the worms, knocking out the insulin-like signaling pathway known to serve as a negative association pathway.

"We perturbed the biology in specific ways and when we ran the experiments, the worm's behavior changed as predicted by our theoretical model," Nemenman says. "That gives us more confidence that the model reflects the underlying biology of learning, at least in C. elegans."

The researchers hope that others will test their model in studies of larger animals across species.

"Our model provides an alternative quantitative model of learning that is multi-dimensional," Ryu says. "It explains results that are difficult, or in some cases impossible, for other theories of classical conditioning to explain."

