An Epsilon-Greedy Analogy for Producing Work

December 2023.



In his essay "An Opinionated Guide to ML Research", John Schulman proposes that epsilon-greedy exploration could help you maximize both depth and novelty in your research output: by dedicating most of your time (\(1-\epsilon\)) to a pre-defined area of focus and a small amount of time (\(\epsilon\)) to exploration, you could deepen your main work while increasing the chance of developing complementary work that further reinforces your research.
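For readers who have not met epsilon-greedy before, here is a minimal sketch of the idea applied to work sessions. The function name `choose_task`, the task labels, and the value \(\epsilon=0.1\) are all my own illustrative assumptions and do not come from Schulman's essay: with probability \(\epsilon\) a session goes to a randomly picked side topic, and otherwise it goes to the main focus.

```python
import random


def choose_task(focus_task, side_tasks, epsilon, rng):
    """Pick what to work on in one session, epsilon-greedy style.

    With probability epsilon, explore: pick a side task uniformly at random.
    Otherwise, exploit: stay on the pre-defined focus task.
    All names here are hypothetical and only for illustration.
    """
    if rng.random() < epsilon:
        return rng.choice(side_tasks)
    return focus_task


rng = random.Random(0)  # fixed seed so the sketch is reproducible
focus = "main research area"
sides = ["new idea A", "new idea B", "new idea C"]

# Simulate 1000 work sessions with a small exploration rate.
sessions = [choose_task(focus, sides, epsilon=0.1, rng=rng) for _ in range(1000)]
explore_share = sum(task != focus for task in sessions) / len(sessions)
print(f"share of sessions spent exploring: {explore_share:.2f}")
```

Over many sessions the share of exploratory sessions settles close to \(\epsilon\), which is all the analogy really needs.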

I think we can make a similar comparison for quantity vs. quality: what percentage of my time should I spend perfecting a pre-defined task, and what remaining percentage should I dedicate to producing new work on the fly?

There are a couple of interesting cases that I would like to present before giving my direct take on this question.

In his book Atomic Habits, James Clear refers to a study where the instructor of a photography class divided the students into two groups: the first group was evaluated on the quality of the photos taken, and the second group was evaluated on the number of photos taken. Oddly enough, by the end of the semester, the second group, evaluated on quantity, ended up producing much better photos than the other group. The "quantity group" was encouraged to experiment and try different things, to discard ideas that wouldn't work, and to refine a personal character and taste in photography. (I feel like the latter half of this paragraph is me extrapolating from the initial case presented by James Clear; I'd highly recommend reading the book to get a more objective sense of the study, but also because it is a great book.)

Anthony Hopkins is a great, renowned actor. He is also known for doing only one take per scene, a take in which I believe Hopkins gives his all.

These are only two examples out of many, and many others may point the opposite way (e.g. director David Fincher recorded a high number of takes for certain scenes of The Social Network, a nominee for the 2010 Oscars' Best Picture).

However, there seems to be some evidence suggesting that a quantity-first approach (trying different things) does lead to personal growth, more so than a quality-first approach (perfecting a single problem through many takes or iterations). Thus, the \(\epsilon\) we would use in a quantity-first approach would be very large (e.g. \(\epsilon=0.99\)), where most of our time is spent on discovery.
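To make that split concrete, here is a small worked example for \(\epsilon=0.99\). The 40-hour week is purely my own illustrative assumption:

\[
\epsilon \cdot 40 = 0.99 \times 40 = 39.6 \text{ hours of discovery}, \qquad (1-\epsilon) \cdot 40 = 0.01 \times 40 = 0.4 \text{ hours} = 24 \text{ minutes of refinement}.
\]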

Another way of seeing this is that every moment comes and goes so fast. Finding the perfect time to write an essay, or a computer program, may take time. And by the time you find that perfect moment, you might not have that same drive or internal state of mind that made you want to write (or create something) in the first place.

There are likely many different and valid approaches to the quantity vs. quality question; though I think that when working on things for artistic or educational pursuits (whose existence does not endanger or harm anyone), a quantity-first approach (\(\epsilon=0.99\)) is an attractive one.