Math -> GPU
I was a grad student in math from 2014 to 2018. I now do GPU programming for ML.
I found the transition really difficult. But having made it, I think it's a good place for a math person in industry, because
- It’s a newer kind of programming: very few people have been doing it for a long time & that levels the playing field
- Getting better performance at the systems level rewards detailed, careful reasoning, which can be a comparative advantage for math folks
- Reflecting that detail, a lot of the relevant manuals (such as the CUDA C++ Programming Guide & the C++ Standard) read like grad math textbooks, which scares some people off.
One might wonder whether, by doing GPU programming for ML, you’re hitching your wagon to the latest fad. But the hope is that by placing yourself low in the stack, your value is more likely to persist and generalize. Instead of thinking of yourself as a “programmer for generative AI”, you can think of yourself as “a programmer of massively parallel, differentiable programs, specializing in algorithms with high arithmetic intensity relative to their expressiveness”. Modern neural networks are particular instances of such programs. Highly parallel, differentiable programming isn’t going anywhere, even once the current version of generative AI is superseded by the next.
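As a rough illustration of what “high arithmetic intensity” means (my back-of-the-envelope sketch, not a claim about any particular library): a dense N×N matrix multiply performs about 2N³ floating-point operations while, in the best case, moving only about 3N² values between memory and the chip, so its FLOPs-per-byte ratio grows with N.

```python
# Minimal sketch: arithmetic intensity of a square matmul, assuming fp32
# operands and ideal reuse (each of A, B, C touches main memory exactly once).
def matmul_arithmetic_intensity(n: int, bytes_per_element: int = 4) -> float:
    flops = 2 * n**3                               # one multiply + one add per inner-product term
    bytes_moved = 3 * n**2 * bytes_per_element     # read A, read B, write C
    return flops / bytes_moved

print(matmul_arithmetic_intensity(4096))  # ~683 FLOPs per byte, and it grows linearly with n
```

Compare that with something like an elementwise add, whose intensity is a small constant no matter how big the tensors get: that contrast is why hardware built for dense linear algebra keeps being worth programming well.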
Similarly, the billions of dollars companies are currently investing in data centers are not bets on current generative AI specifically, but bets on highly parallel programming.
If anyone is interested in guidance or mentorship in this area, feel free to reach out: EricAuld@gmail.com