# Evolution Strategies

We have discovered that evolution strategies (ES), an optimization technique that has been known for decades, rival the performance of standard reinforcement learning (RL) techniques on modern RL benchmarks (e.g., Atari/MuJoCo), while overcoming many of RL's drawbacks.

In particular, ES is simpler to implement (no need for backpropagation), is easier to scale in a distributed environment, does not suffer in environments with sparse rewards, and has fewer hyperparameters. This result is surprising because ES resembles simple hill climbing in a high-dimensional space based solely on finite differences along a few random directions at each step.
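The hill-climbing view described above can be sketched in a few lines of NumPy: perturb the parameters with random noise, score each perturbed copy, and step toward the reward-weighted average of the perturbations. This is a minimal illustrative sketch, not a benchmark implementation; the toy reward function, the `evolve` helper, and the hyperparameter values are assumptions chosen for clarity.

```python
import numpy as np

# Hypothetical optimum for a toy task (illustrative, not from any benchmark).
solution = np.array([0.5, 0.1, -0.3])

def reward(w):
    # Toy "reward": higher is better, maximized at w == solution.
    return -np.sum((w - solution) ** 2)

def evolve(npop=50, sigma=0.1, alpha=0.001, iters=300, seed=0):
    # npop: population size, sigma: noise scale, alpha: learning rate.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(3)  # random initial parameter vector
    for _ in range(iters):
        # Sample a population of random perturbation directions.
        noise = rng.standard_normal((npop, w.size))
        # Evaluate the reward at each perturbed parameter vector.
        rewards = np.array([reward(w + sigma * n) for n in noise])
        # Standardize rewards, then step along the reward-weighted average
        # of the perturbations -- a finite-difference gradient estimate.
        adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        w = w + alpha / (npop * sigma) * noise.T @ adv
    return w

print(evolve())  # ends up close to `solution`
```

Note that no backpropagation is needed: the only information flowing back from each population member is a single scalar reward, which is what makes the method easy to distribute across workers.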

Our finding continues the modern trend of achieving strong results with decades-old ideas. For example, in 2012 the "AlexNet" paper showed how to design, scale, and train convolutional neural networks (CNNs) to achieve exceptionally strong results on image recognition tasks, at a time when most researchers thought CNNs were not a promising approach to computer vision. Similarly, in 2013 the Deep Q-Learning paper showed how to combine Q-Learning with CNNs to successfully play Atari games, reinvigorating reinforcement learning (RL) as a research field with exciting experimental (rather than purely theoretical) results. Likewise, our work demonstrates that ES achieves strong performance on RL benchmarks, dispelling the common belief that ES methods cannot be applied to high-dimensional problems.