What should I learn if I want to stay competitive in the job market in a world where AI is automating everything? Are there any projects I could build that would make me more competitive? (I prefer building stuff instead of aimlessly learning random topics.) I was trying to understand the transformer architecture and self-attention, and reading AI papers, but I feel like none of this sh*t matters unless you work at OpenAI or something and have a giant cluster of GPUs with terabytes of training data. Without scale, anything one builds will be inferior. I could try to build products using the OpenAI API or open-weight models (e.g., Llama, Mistral), but it feels like that's all just going to get commoditized as well in one giant race to the bottom, especially as incumbents like Google integrate AI into their products (calendar, email, etc.). What specifically should I be doing to stay on top of things in AI and remain competitive? Or is none of this worth my time if I'm not already some Stanford AI PhD gigachad?
learn pytorch or jax. both are very similar, and plenty of tutorials are available, including notebooks you can run through by example. you want a grasp on the common math and operators used: basic linear algebra, basic calculus, gradient descent and backpropagation, linearization, loss functions like cross-entropy loss, feed-forward networks, transformers, embeddings. look up videos / tutorials for building an LLM from scratch. this stuff is really not complicated at all, just new and still a bit esoteric. modern ML is on the surface simpler than many other software domains. if you want to be indispensable to someone, learn how to program GPUs/TPUs and learn where the bottlenecks are there. the most impactful work anywhere right now is making this stuff faster.
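to give a feel for how small the core ideas are, here is a rough sketch of gradient descent on a cross-entropy loss in plain python, no pytorch or jax. everything in it is made up for illustration (the toy dataset, the function names, the learning rate), and the gradient is written out by hand rather than computed by backpropagation, which is what a real framework would do for you.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    # Negative log-probability assigned to the correct class.
    return -math.log(probs[label])

def forward(W, b, x):
    # One linear layer: logits[k] = W[k] . x + b[k]
    return [sum(w * xi for w, xi in zip(W[k], x)) + b[k] for k in range(len(b))]

def train(data, n_features, n_classes, lr=0.5, epochs=200):
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            probs = softmax(forward(W, b, x))
            total += cross_entropy(probs, y)
            for k in range(n_classes):
                # For softmax + cross-entropy, dLoss/dlogit_k = p_k - 1[k == y].
                grad = probs[k] - (1.0 if k == y else 0.0)
                for j in range(n_features):
                    W[k][j] -= lr * grad * x[j]
                b[k] -= lr * grad
        losses.append(total / len(data))
    return W, b, losses

def predict(W, b, x):
    logits = forward(W, b, x)
    return max(range(len(logits)), key=logits.__getitem__)

# Hypothetical toy dataset: the label is simply the first feature.
TOY_DATA = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]

if __name__ == "__main__":
    W, b, losses = train(TOY_DATA, n_features=2, n_classes=2)
    print("first epoch loss:", losses[0])
    print("last epoch loss:", losses[-1])
    print("predictions:", [predict(W, b, x) for x, _ in TOY_DATA])
```

an LLM is the same loop with a much bigger model in place of the single linear layer and next-token prediction as the classification task.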
Ok, say I learn to build an LLM like GPT from scratch. Then what? I'm a bit doubtful that companies are going to find me valuable just because I went through Andrej Karpathy's YouTube videos. You said yourself that it's not that complicated. Learning how to program GPUs/TPUs sounds cool, but it sounds like something super niche that one can't meaningfully do unless one is already working in that industry. I'm looking for something concrete to work on or build that might make me attractive to these AI companies, but it seems like the bar for creating something impressive is so high that one practically already needs to be in the industry. But even if we ignore that and assume I'm super motivated with a ton of time on my hands, I'm not exactly sure where specifically to direct that attention. Also, building sh*ttier LLMs than what's already available doesn't exactly motivate me or seem valuable in the market.
sounds like you aren't going to get very far
What exactly do you want? You're not going to be an AI researcher. Knowing the internals of AI is not relevant. Deploying AI models is just software engineering with some small tweaks. Maybe focus on that.
What's the best way to learn this? I just want to be best positioned in a competitive job market where AI is automating a ton of work.
You just do it tbh. Make some lame LLM wrapper, deploy the model yourself, and face all the challenges that come with that.
OP, what do you actually do in your job? This sounds like asking how to learn Excel to be competitive, when the work of accountants, competitive Excel users, and the Excel developers at MS is all very different.
Full stack web development. I used to specialize in frontend (a mistake in hindsight), but moved more to backend because I felt frontend was being commoditized. Now I'm thinking of pivoting into AI/ML because I think all software development is getting commoditized, and also because I find dev kinda boring now and AI feels like a more interesting challenge. I'm unemployed at the moment, trying to break into either an AI company, or break into Big Tech with the idea of trying to pivot into an AI group or ML role. I was able to get to the final rounds at some companies like OpenAI for applied engineering roles, which would've been ideal, but ultimately didn't get offers. Now I'm waiting to hear back from the Meta onsite, and I'm in the process with another top AI company. If these fail, then I just need to keep applying...
So you do want to go deeper into the subject and are applying to those places with thousands of GPUs. Then it seems like reading more papers is the right call? You can build applications too, but they can't be random LLM wrappers. I'm thinking of stuff like DeepSpeed and bitsandbytes that requires strong theoretical knowledge. And tbh you may not find AI research that un-boring: we have 1 architecture that we have been using for nearly a decade and a handful of techniques that people slightly iterate on. Sometimes you get bigger innovations like RLHF or diffusion models, but most of the rest is much more derivative and boring.
Learn about RAG, MLOps, and model evaluation.
I would say just learn the Transformer. Everything else is outdated. I never learned about the CNNs, RNNs, and DNNs that most courses teach. Two years ago, when I was preparing for AI/ML interviews, I only studied the Transformer and was able to clear the interviews.
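If RAG sounds abstract, the retrieval half is roughly this: score your documents against the question, keep the best ones, and paste them into the prompt. Below is a minimal sketch in plain Python, assuming a bag-of-words cosine similarity in place of the learned embeddings and vector store a real system would use. The documents, function names, and prompt template are all hypothetical, and the final call to a model is left out.

```python
import math
from collections import Counter

def bag_of_words(text):
    # Crude stand-in for an embedding: lowercase word counts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=1):
    # Rank documents by similarity to the query and keep the top k.
    q = bag_of_words(query)
    ranked = sorted(docs, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, k=1):
    # The "augmented" part: retrieved text is prepended to the question.
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge base.
DOCS = [
    "refunds are processed within five business days",
    "our office is closed on public holidays",
    "passwords must be at least twelve characters long",
]

if __name__ == "__main__":
    print(build_prompt("when are refunds processed", DOCS))
```

Model evaluation then becomes checking, over a set of questions with known answers, how often the right document gets retrieved and how often the final answer is correct.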
Just try to solve your problems. If AI helps solve it better, do it. Simple as that. Solve problems, don’t focus on using tech for the sake of using it.
What’s there to learn? You just ask it things.
What should I ask it?
Ask it how to learn about AI