Let's talk about AI and coding this week. Since ChatGPT started on the hype train earlier this year, I've closely watched its progress and evaluated its use cases.
This week, I came across an article referencing a Stanford University study about how ChatGPT's behavior has changed over time. The researchers found that "their (GPT-3.5 and GPT-4) performance on some tasks have gotten substantially worse over time".
I'm interested in everything AI can accomplish, but my focus is mostly on its code generation capabilities, since they impact my learning community and me the most. I've stated before that I don't find LeetCode to be indicative of new-hire performance. Even so, I was surprised that when the researchers tested GPT-4 with 50 problems categorized as "easy" by LeetCode, the percentage of acceptable code (passing the tests) dropped from 52% in March to 10% in June.
Before celebrating the death of LLMs, we need to note why the acceptable code level dropped: the LLM didn't follow the instructions and added extra commentary and markup to the generated code. When that extra text was scrubbed out, the performance increased by 18% from March to June.
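To make the "scrubbing" idea concrete, here is a minimal sketch of how extra commentary and markdown fences might be stripped from a model's reply before the code is run. This is not the researchers' actual method. The function name `extract_code` and the sample reply are hypothetical and purely for illustration.

```python
import re

# Three backticks, built programmatically so this example does not
# contain a literal fence of its own.
FENCE = "`" * 3

# An opening fence with an optional language tag, the code body,
# and a closing fence. DOTALL lets the body span multiple lines.
FENCE_RE = re.compile(
    re.escape(FENCE) + r"[^\n]*\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def extract_code(response: str) -> str:
    """Return the first fenced code block in a reply.

    Hypothetical helper: if the reply has no fence, the whole reply
    is assumed to be code and is returned with outer whitespace removed.
    """
    match = FENCE_RE.search(response)
    if match:
        return match.group(1).strip()
    return response.strip()


# A made-up reply with chatter before and after the code.
sample_reply = (
    "Sure! Here is the solution:\n"
    f"{FENCE}python\n"
    "def add(a, b):\n"
    "    return a + b\n"
    f"{FENCE}\n"
    "Hope this helps!"
)

cleaned = extract_code(sample_reply)
```

A filter like this only handles the simplest case (one fenced block). Replies with several blocks, or with commentary mixed into unfenced code, would need more careful handling.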
This is something I have noticed when working with LLMs over time. They are becoming more verbose and less likely to follow detailed instructions, which reduces their usefulness. I have also noticed that for some tasks, like generating code to perform a sort, the model sometimes will not write the sort I asked for and instead generates a different type of sort. The code "works", but it's not what I asked for, so you must be a skilled developer to spot the difference and use LLMs effectively.
As I've explored LLMs, I've found that they do a fairly poor job with the more advanced critical thinking skills involved in IT. When I asked ChatGPT about topics like software architecture, performance tuning, and other senior/architectural concerns, its responses ranged from correct but superficial to completely wrong (and sometimes wrong in a way that would cause serious problems in the software stack).
I'm not the only one who has noticed this, as evidenced in a report by Immunefi, a web security company. Immunefi found that about 64% of respondents said ChatGPT provided "limited accuracy" in identifying security vulnerabilities, and approximately 61% said it lacked the specialized knowledge for identifying exploits that hackers can abuse.
ChatGPT and LLMs are changing the way I work, and they are going to impact jobs. Still, I'm very skeptical of doom-and-gloom scenarios for anyone except spam and misinformation peddlers, who are going to be able to crank out more content than ever.
I certainly don't trust it to assist job seekers on LinkedIn.