In which we answer this question with a no,
but provide a running start on an alternative.
PDF
Paperback
Recap
- We're looking for a way to train a robot with verbal commands and rewards the way we would train a dog. The current collection of popular machine learning methods can't do this yet.
- .
- Unsupervised learning finds groups and sequences, but isn't suited to learning the logic of taking actions. Supervised learning can learn to classify situations according to the action they call for, but does so slowly and clumsily.
- .
- Reinforcement learning is a good model for training a robot. It handles sensors, actions, and rewards just the way we want. However, existing methods need too many rewarded examples to be practical.
- .
- Human directed reinforcement learning is what we've named a variant of reinforcement learning where a human is manually providing all the rewards. They may also be providing cues or interacting with the robot.
- .
- If successful, HDRL would open up a lot of possibilities for using robots and automation to get things done.
Resource links
-
1. Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An
Introduction, Second Edition. MIT Press, Cambridge, MA, 2018
Website: http://incompleteideas.net/book/the-book-2nd.html
Full PDF: http://incompleteideas.net/book/RLbook2020.pdf -
2. Laura Smith, Ilya Kostrikov, Sergey Levine. "A Walk in the Park: Learning to
Walk in 20 Minutes With Model-Free Reinforcement Learning"
arXiv:2208.07860v1 [cs.RO] 16 Aug 2022
Blog: https://sites.google.com/berkeley.edu/walk-in-the-park
Paper: https://arxiv.org/abs/2208.07860 -
3. Alex Irpan. "Deep Reinforcement Learning Doesn't Work Yet" 2018
Blog: https://www.alexirpan.com/2018/02/14/rl-hard.html -
4. Allison Marie Horst, Alison Presmanes Hill, Kristen B Gorman.
"palmerpenguins: Palmer Archipelago (Antarctica) penguin data," R package
version 0.1.0. doi:10.5281/zenodo.3960218 2020
Blog: https://allisonhorst.github.io/palmerpenguins/ -
5. Curtis G. Northcutt, Anish Athalye, Jonas Mueller. "Pervasive Label Errors in
Test Sets Destabilize Machine Learning Benchmarks" Proceedings of the 35th
Conference on Neural Information Processing Systems Track on Datasets and
Benchmarks. Dec 2021
Website of examples: https://labelerrors.com/
Paper: https://arxiv.org/pdf/2103.14749.pdf -
6. Boston Dynamics
Website: https://www.bostondynamics.com/