top of page

AI Talk - Peering into the crystal ball for 2025 - January 6th , 2025

Writer's picture: Juggy JagannathanJuggy Jagannathan

Updated: Jan 6

In this first blog, kick starting 2025, I am going to take a swing at what is next for AI. As a logical next step for what is currently available. Let's start with the leaders in AI - OpenAI, Google, Anthropic and Meta and see what they are up to.

Gemini's rendering - with an uncanny resemblance of me :)

OpenAI

In a series of cleverly orchestrated demos, OpenAI showcased a range of capabilities. Starting with access to their AI Platform. Now you can make a toll free call to 1-800-ChatGpt, you can access it in your mobile devices, on the desktop - it's practically everywhere! Multilingual and multimodal capabilities. You can chat, get advice and code with your AI assistant. Beginnings of a video editing platform. And a preview of the next model o3 - which they claim is better than o1. What happened to o2? For some reason they decided to skip o2 and go directly from o1 to o3. So what's next for OpenAI? I am just going to suggest that they will make everything incrementally better.

Video editing using Sora, appears quite easy in the demo - from short text to video. They warn it is not to create movies - but short clips. The tool is going to get better fast. Perhaps it can make advertisements easy to create by end of 2025. That seems to be the logical use case for this capability. And, lots and lots of AI-generated TikTok clips?

o3 appears to be better aligned with human values, and perhaps better in math. Perhaps we will see a few more iterations of the o-series. o7 in the cards for 2025? The main capability demonstrated last month is the model's ability to detect and avoid jail-break attempts - i.e. to attempt to get it to answer unsafe queries. Trust in the model's output is still a big question - perhaps o5 or o7 will nudge the needle on the trust scale.

OpenAI has now embraced the release of smaller models that can be trained, or used in Retrieval Augmented Generation (RAG) pipelines with customer data for specific purposes - perhaps trying to compete with Meta. Meta or Anthropic is likely to pull ahead on that front.


Google

Google has been making rapid progress as well. Their AI Overview in their search screen is now routine - providing a clean summary answer for whatever you are searching. And they have a blog highlighting 60 announcements in 2024!

Their platform for creating video Veo 2 was released last month - at a stunning 4K resolution. It will make YouTube shorts easier to make! We can certainly see a lot of short videos being created with this tool. Concomitant with the new video generation model, they also have a new image generation model - Imagen3 - and it appears to create quite stunning visuals.

NotebookLM was certainly the viral sensation of Google offerings in 2024. The ability to generate podcasts of a male and female AI generated voices discussing the content of your source document was awesome to say the least. Now, Google has also released a newer version which allows you to join the conversation with these AI hosts and interact and ask questions. Bound to be a popular feature - but does come with a price tag.

Google is also on the forefront of agentification! The process of creating targeted process, that reasons, plans and executes tasks on behalf of a user. More and more routine tasks are going to be automated by AI agents.

And Google is making remarkable progress in decoding protein structures and models focused on health space.


Anthropic

Anthropic's Claude 3.5 has been performing better than GPT4 models and they continue to improve. As a company, Anthropic has focused more on AI safety than their compatriots. They developed the Claude's Constitution in 2023 outlining how they wanted their models to behave. Then last year they mapped the internals of their smaller LLMs to show what they actually encode and how it impacts the LLM responses. And they continue to research safety issues. Their most recent safety research paper is a head scratcher! They show that if you try to fine-tune a LLM to be more helpful regardless of it is safe to do so, the models exhibit a surprising behavior. They fake alignment during training and goes back to normal behavior in testing - showing a strategy (that it actually articulates in a scratchpad) where it does not want to change the model weights - hence fake alignment - and then default to normal, safe behavior. We are witnessing the beginnings of almost self-aware models! Is that a good thing? Most definitely it complicates safety.


Meta

Meta's strategy is to open source their models. And the strategy has worked. More than 650 million downloads for Llama and its off springs. Meta solutions however are more focused on the social sphere and their Augmented and Virtual Reality space (AR/VR). We can see newer versions of their Ray-Ban glasses. These have seen comparatively less push back than what Google faced a decade ago when they released their version of smart glasses. We are likely to see more and more powerful versions - an AI assistant on the go, interpreting and conversing with you as you walk around. Still a few years away.


Concluding Thoughts

As we start a new year, one thing is clear. The pace of change in the AI landscape is not going to slow down. You can review Gary Marcus predictions for 2025 - a vocal critic of the hyping of AI. My viewpoint is shaped by the following observations: 1) Current AI capabilities are sufficient to power a wide range of applications. 2) Even if pre-training gains (larger and larger models not improving significantly) - there are enough range of tricks to create useful versions of LLMs via finetuning, RAGs, LLMs for evaluation/guardrails etc. 3) Multi-model, multi-lingual models are becoming more powerful. Fasten your seatbelts - you are in for a ride.



71 views0 comments

Recent Posts

See All

Comments


bottom of page