Thomas Wolf

thomaswolfcontact [at] gmail [dot] com
I'm a co-founder of Hugging Face, where I oversee the open-source and science teams.
I enjoy creating open-source software that makes complex research accessible (I'm most proud of creating the Transformers and Datasets libraries as well as the Magic-Sand tool). When I'm not building OSS, I push for open science in AI/ML research, trying to narrow the gap between academia and industrial labs by imagining projects like the BigScience Workshop on Large Language Models. My current research interests are centered around overcoming the current limitations of Large Language Models with multi-modalities and complementary approaches. Finally, I also enjoy writing and filming educational content on ML and NLP, including writing the reference book Natural Language Processing with Transformers, published by O'Reilly and written with my amazing co-authors Lewis Tunstall and Leandro von Werra, writing on my Medium blog, and recording out-of-the-ordinary videos like The Future of Natural Language Processing.

Short bio
I’ve been programming since forever, writing video games and software in Assembly and C/C++, but my first career was actually in Physics rather than Computer Science.
After graduating from Ecole Polytechnique (Paris, France), I worked on laser-plasma interactions at the BELLA Center of the Lawrence Berkeley National Laboratory (Berkeley, CA). I was accepted for a Ph.D. at MIT (Cambridge, MA) in the USA but ended up doing my Ph.D. in Statistical/Quantum physics at Sorbonne University and ESPCI (Paris, France), working on superconducting materials for the DGA (French DARPA) and Thales. After my Ph.D., I needed a change from the long time scale of experiments in physics and ended up totally changing direction. I joined an IP law firm, Cabinet Plasseraud (Paris, France), got a law degree from Pantheon Sorbonne University and worked as a Patent Attorney for 5 years, helping a portfolio of startups and big companies build and defend their Intellectual Property assets.
In 2015, I was consulting for many Deep-Learning/AI/ML startups and they made me discover the maths behind the new ML/AI revolution. I realised that most of these methods, equations and tools were just re-branded statistical physics approaches, which fueled my interest in Machine Learning and Deep Learning. I started my online education in AI/ML by reading books and following online courses. About a year later, one of my friends asked me if I wanted to start something crazy ambitious with Hugging Face, and there I was, doing science and coding again and having a lot of fun!

Publications
My full publication list can be found on my Google Scholar page. A couple of notable ones are:

Open source
Most of my open-source work can be found on the Hugging Face GitHub repository. A couple of notable libraries I created are:

Blog
I like to explain clearly what I have learned, and this has led to a few blog posts that were quite interesting to others as well, I guess (they totalled over a quarter million views at the end of 2018). I will try to continue writing things like that when I find the time. I used to be a teacher during my Ph.D. and I do miss teaching. Blogging is my substitute. A couple of notable posts:

Invited Talks and News
Copyright Thomas Wolf 2017-2022