When you go to a restaurant, do you always order the same thing or do you like to try something new?
If you order your favourite dish, you're guaranteed a delicious meal. If you order something you haven't tried before, you might discover a new favourite ... or be disappointed.
This is an example of the exploration-exploitation dilemma, which arises whenever our desire for information (in this case, about other dishes on the menu) conflicts with our need for reward (a satisfying meal).
This dilemma is a constant in problem-solving and decision-making. Whenever we make a decision, we face a dichotomous choice: exploit what has worked in the past or explore something else that might be even better.
A shifting equilibrium
"There's an inherent trade-off between exploration and exploitation, and the balance between the two does change over a person's lifetime," said Becket Ebitz, a professor in the Department of Neurosciences at Université de Montréal.
A recent literature review by Ebitz and his co-authors concluded that certain types of exploratory behaviour are common in preschoolers and then decrease with age.
"Our colleagues who work on child development have found that toddlers are very motivated to explore their environment; they are interested in everything and will try new things even when they're ineffective or unwise," said Ebitz. "As they get older, children move away from this type of exploring and focus more on exploiting the information they already have."
According to Ebitz, this explore-exploit continuum is very important in learning, and where a person sits on it varies with context. For example, in a recent study, he and his co-authors reported that healthcare workers exhibited lower rates of exploratory learning during the COVID-19 pandemic.
Lack of energy to explore
"When we are stressed, when the world around us becomes too unpredictable and overwhelming, we don't have the energy to explore and learn, so we stick to using the information we already have," he explained.
This tension ties into the stability-plasticity dilemma: the brain needs enough plasticity to acquire new knowledge and enough stability to retain it.
"At this point, we think we don't need plasticity and stability at the same time; we alternate between the two," said Ebitz. "There would appear to be a great deal of individual variation in this process, which could explain why some learners are more successful in certain environments than in others."
It's all in the eyes
In another recent study, Ebitz and his colleagues found that pupil size, a physiological sign of excitation (arousal) controlled by the autonomic nervous system, predicts the onset of exploration and the associated neural activity.
They found that processes linked to pupil dilation drive the brain's prefrontal cortex to a critical tipping point that permits exploratory decisions.
"It appears that a person's level of excitation - the way in which they are activated - can explain how close they are to initiating exploratory behaviour," explained Ebitz.
"In general, when we feel excited or stressed, we try to regulate our physical state to return to a neutral state, but this breakthrough suggests that yielding to the excitation could open up new possibilities."