Thomas Wolf

thomaswolfcontact [at] gmail [dot] com
Thomas Wolf is a co-founder of Hugging Face, where he has led the open-source, educational and research efforts since their inception.
Thomas enjoys creating open-source software that makes complex research, models and datasets widely accessible (for instance by creating the Hugging Face Transformers and Datasets libraries). When he's not building OSS libraries, he can be found pushing for open science in AI/ML research and trying to narrow the gap between academia and industrial labs through projects like the BigScience Workshop on Large Language Models (LLM), which led to the BLOOM experiments, model and dataset. His current research interests center on LLM accessibility as well as measuring and overcoming the present limitations of Large Language Models. He also enjoys writing and filming educational content on AI, ML and NLP, including writing the reference book "Natural Language Processing with Transformers" published by O'Reilly with amazing co-authors, writing (not often enough) on his blog and recording (also not often enough) educational videos like The Future of Natural Language Processing.

Short bio
I’ve been programming since forever, writing video games and software in Assembly and C/C++, but my first career was actually in Physics rather than Computer Science.
After graduating from Ecole Polytechnique (Paris, France), I worked on laser-plasma interactions at the BELLA Center of the Lawrence Berkeley National Laboratory (Berkeley, CA). I was accepted for a Ph.D. at MIT (Cambridge, MA) but ended up doing my Ph.D. in statistical/quantum physics at Sorbonne University and ESPCI (Paris, France), working on superconducting materials for the DGA (the French equivalent of DARPA) and Thales. After my Ph.D., I needed a change from the long time scales of experiments in physics and ended up changing direction entirely. I joined an IP law firm, Cabinet Plasseraud (Paris, France), earned a law degree from Pantheon Sorbonne University and worked as a Patent Attorney for 5 years, helping a portfolio of startups and big companies build and defend their Intellectual Property assets.
In 2015, I was consulting for a number of Deep-Learning/AI/ML startups, and they introduced me to the maths behind the new ML/AI revolution. I realised that most of these methods, equations and tools were just re-branded statistical physics approaches, which fueled my interest in Machine Learning and Deep Learning. I started my online education in AI/ML, reading books and following online courses. About a year later, one of my friends asked me if I wanted to start something crazily ambitious with Hugging Face, and there I was, doing science and coding again and having a lot of fun!

Publications
My full publication list can be found on my Google Scholar page. A couple of notable ones are:

Open source
Most of my open-source work can be found on the Hugging Face GitHub repository. A couple of notable libraries I created are:

Blog
I like to explain clearly what I have learned, and this has led to a few blog posts that turned out to be quite interesting to others as well (they totaled over a quarter million views by the end of 2018). I will try to continue writing things like that when I find the time. I used to teach during my Ph.D. and I do miss teaching; blogging is my substitute. A couple of notable posts:

Invited Talks and News
Copyright Thomas Wolf 2017-2022