Thomas Wolf

thomaswolfcontact [at] gmail [dot] com
Thomas Wolf is co-founder and Chief Science Officer (CSO) of Hugging Face, where he has led the company's open-source, educational and research efforts since their inception.
Thomas enjoys creating open-source software that makes complex research, models and datasets widely accessible (for instance by creating the Hugging Face Transformers and Datasets libraries). When he's not building OSS libraries, he can be found pushing for open science in AI/ML research and trying to narrow the gap between academia and industrial labs through projects like the BigScience Workshop on Large Language Models (LLM), which led to the BLOOM experiments, model and dataset. His current research interests center on LLM accessibility as well as measuring and overcoming the present limitations of Large Language Models. He also enjoys writing and filming educational content on AI, ML and NLP, including writing the reference book "Natural Language Processing with Transformers" published by O'Reilly with amazing co-authors, writing (not often enough) on his blog and recording (also not often enough) educational videos like The Future of Natural Language Processing.

Short bio
I’ve been programming since forever, writing video games and software in Assembly and C/C++, but my first career was actually in Physics rather than Computer Science.
After graduating from Ecole Polytechnique (Paris, France), I worked on laser-plasma interactions at the BELLA Center of the Lawrence Berkeley National Laboratory (Berkeley, CA). I was accepted for a Ph.D. at MIT (Cambridge, MA) but ended up doing my Ph.D. in Statistical/Quantum Physics at Sorbonne University and ESPCI (Paris, France), working on superconducting materials for the DGA (the French DARPA) and Thales. After my PhD, I needed a change from the long time scales of experiments in physics and ended up totally changing direction. I joined an IP law firm, Cabinet Plasseraud (Paris, France), got a law degree from Pantheon Sorbonne University and worked as a Patent Attorney for 5 years, helping a portfolio of startups and large companies build and defend their Intellectual Property assets.
In 2015, I was consulting for many Deep-Learning/AI/ML startups, and they got me to discover the maths behind the new ML/AI revolution. I realised that most of these methods, equations and tools were just re-branded statistical-physics approaches, which fueled my interest in Machine Learning and Deep Learning. I started my online education in AI/ML, reading books and following online courses. About a year later, one of my friends asked me if I wanted to start something crazily ambitious with Hugging Face, and there I was, doing science and coding again and having a lot of fun!

Blog
I like to explain what I have learned, and this has led to a few blog posts that were apparently quite interesting to others as well (they totaled over a quarter million views by the end of 2018). I will try to continue writing things like that when I find the time. I used to teach during my PhD and I do miss teaching; blogging is my substitute. A couple of notable posts:
  • 🐳 Some notes on "DeepSeek and export control" I finally took the time to go over Dario's essay on DeepSeek and export control and wrote some notes. I mostly disagree and think it misses the point.
  • 💥 Training Neural Nets on Larger Batches: Practical Tips for 1-GPU, Multi-GPU & Distributed Setups I've spent most of 2018 training models that could barely fit 1-4 samples per GPU. But SGD usually needs more than a few samples per batch for decent results. I wrote a post gathering the practical tips I use, from simple tricks to multi-GPU code and distributed setups (a minimal gradient-accumulation sketch follows this list).
  • ⛵ Learning Meaning in Natural Language Processing — The Semantics Mega-Thread A summary, overview and map of a huge discussion on learning meaning in NLP that happened on Twitter in August 2018, with more than 100 comments and great input from Matt Gardner, Yoav Goldberg, Sam Bowman, Emily M. Bender, Graham Neubig, Jeremy Howard, Tal Linzen, Jacob Andreas, Ryan D. Cotterell ...
  • 🚀 100 Times Faster Natural Language Processing in Python How you can make your Python NLP module 50-100 times faster by using spaCy's internals and a bit of Cython magic! Comes with a Jupyter notebook with examples processing over 80 million words per second.
  • ...
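The core trick behind the batch-size post above is gradient accumulation. Here is a minimal, self-contained PyTorch sketch of the idea (toy model and random data for illustration, not code from the post): scale each small mini-batch loss and only step the optimizer every few batches to simulate a larger effective batch.

```python
# Minimal gradient-accumulation sketch (toy model and random data, illustration only).
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
accumulation_steps = 8          # effective batch = 4 samples x 8 steps = 32

optimizer.zero_grad()
for step in range(64):
    inputs = torch.randn(4, 10)              # a small 4-sample mini-batch
    labels = torch.randint(0, 2, (4,))
    loss = loss_fn(model(inputs), labels)
    (loss / accumulation_steps).backward()   # gradients accumulate in .grad
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()                     # one update per accumulated "large" batch
        optimizer.zero_grad()
```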

Publications
My full publication list can be found on my Google Scholar page. A couple of notable ones are:

Open source
Most of my open-source work can be found on the Hugging Face github repository. A couple of notable libraries I created are:
Invited Talks and News

Copyright Thomas Wolf 2017-2025