David Luan: Why Nvidia Will Enter the Model Space & Models Will Enter the Chip Space | E1169
25 Jun 2024
- OpenAI realized that the next phase of AI after Transformer models would involve solving major unsolved scientific problems.
- There are still ways to improve model performance, which will require a significant amount of compute.
- David Luan is excited to be on the podcast and has watched previous episodes.
- David Luan has worked at several incredible companies, which have served as training grounds for him.
- Google Brain was dominant in AI research during the 2012-2018 period.
- Google Brain's success was due to its ability to attract top talent and provide an environment that encouraged pure bottom-up basic research.
- Pure bottom-up basic research involves hiring brilliant scientists, giving them the freedom to work on open technical problems in AI, and allowing them to collaborate and share ideas.
- This approach led to significant breakthroughs, such as the invention of the Transformer model.
- The Transformer model, invented at Google Brain, was a general-purpose model that could be applied to a wide range of machine learning tasks.
- Prior to the Transformer, different models were required for different tasks, such as image understanding, text generation, and playing games.
- The Transformer's versatility made it the foundation of modern AI and reduced the need for low-level breakthroughs in modeling.
- Researchers could now focus on using the Transformer to solve complex real-world problems.
- The success of the show is attributed to the willingness to ask simple questions.
- ChatGPT is seen as a consumer breakthrough after the introduction of Transformers in 2017.
- The gap between the Transformer breakthrough and consumer adoption of ChatGPT is questioned.
- Language models have been gradually improving since the introduction of Transformers in 2017.
- GPT-2, released in 2019, demonstrated impressive generalist capabilities.
- Two key factors contributed to the viral success of ChatGPT:
  - Language models became intelligent enough to be compelling.
  - User-friendly packaging let consumers interact with the models.
- The GPT-3 API was available over a year before ChatGPT, but it lacked the consumer-friendly interface, hindering its widespread adoption.
- OpenAI realized that the next phase of AI after Transformer was not about research paper writing but about solving major unsolved scientific problems.
- Instead of loose collections of researchers, OpenAI focused on building a culture around large groups of scientists working on specific real-world problems.
- This approach is different from the traditional academic curiosity-driven research and is similar to the Apollo project, where the goal is to solve a specific problem rather than just conducting research.
- Historically, scaling up language models has shown diminishing returns for every incremental GPU added.
- However, every doubling of GPUs has resulted in very predictable and consistent returns.
- To make a base language model predictably and consistently smarter, the amount of compute needs to be doubled.
- Model scaling means enlarging the model and training it on more data to make it more intelligent.
- The focus is shifting from base model scaling to more efficient methods like simulation, synthetic data, and reinforcement learning.
- Models trained on data have limitations in discovering new knowledge or solving problems not in the training set.
- Chatbots and agents are diverging into distinct technologies with varying use cases and requirements.
- Hallucinations in chatbots and image generators can be beneficial for problem-solving, novelty, and creativity.
- Agents performing tasks like handling taxes or shipping containers must not hallucinate, because errors in those tasks cause real problems.
- AI systems have become less predictable as they grow larger and more complex.
- The capabilities of a model cannot be fully predicted in advance, even with estimates.
- Minimum viable capabilities are a function of model scale, meaning certain capabilities emerge at specific model sizes.
- It is difficult to predict the exact resource investment needed to achieve specific capabilities.
- Reasoning is a challenging problem in AI that requires new research.
- Pure model scaling alone does not provide solutions to reasoning.
- Reasoning involves composing existing thoughts to discover new ones, which cannot be achieved solely through internet data regurgitation.
- Solving reasoning requires providing models with access to theorem-proving environments, similar to how human mathematicians approach problem-solving.
- The general capability of reasoning needs to be solved at the model provider level to improve the model's ability to reason.
- The high cost of solving reasoning limits the number of long-term, steady-state LLM providers to at most 5-7.
- Solving reasoning involves training a base model, granting it access to various environments to solve complex problems, and incorporating human input.
- Memory in AI presents challenges in both short-term working memory and long-term memory, with progress made in the former but not the latter.
- Application developers should integrate long-term memory about user preferences as part of a larger system.
- LLMs are components of a larger software system, and the winning LLM providers will be those that existentially need to win, such as tier-one cloud providers.
- Nvidia should expand its chip business by offering LLM services, while major LLM providers are developing in-house chips for better margins.
- The interface of the LLM provides significant leverage downstream, leading to expected vertical integration pressure between model builders and chip makers.
- Chip makers need to own something at the model layer to avoid commoditization.
- Apple has an advantage in running smart models for free at the edge, but that may not be enough to compete with the smartest models.
- Apple will excel in private, fine-tuned models that don't require extensive reasoning and will run on the edge.
- OpenAI's GPT-4o represents a scientific advancement towards universal models that can process various inputs and generate any output.
- Apple's partnership with OpenAI suggests recognition of OpenAI's progress and hints at a future where foundational models become commoditized.
- Tier-one cloud providers will invest heavily in foundational models, while independent companies selling models to developers must either become first-party efforts of big clouds or build a substantial economic flywheel before commoditization.
- Adept's focus on selling end-user-facing agents to enterprises differentiates its business model from companies selling models to developers.
- OpenAI's ChatGPT provides a significant advantage in building an economic flywheel for independent foundational model companies.
- Adept is focused on building an AI agent that can handle arbitrary work tasks.
- Adept is not trying to train foundation models to sell to others, but rather building a vertically integrated stack from the user interface to the foundation modeling layer.
- Adept believes that vertical integration of model with use case is the only way to solve the generalization problem and handle the variability of agent requirements across industries.
- Adept's advantage is that it is building a system that can handle any workflow or task in any enterprise, rather than focusing on a particular narrow problem.
- Every enterprise workflow is an edge case, and controlling the entire stack is necessary to handle the variability and complexity of these workflows.
- RPA is suitable for repetitive tasks, while agents require constant thinking and planning to solve goals.
- Agents disrupt the business model of RPA players by demanding self-serve capabilities and addressing challenging use cases.
- The pricing model may shift from per-seat to consumption-based in some areas, but valuable knowledge work will not be priced that way, because AI agents empower people to explore new possibilities and enhance their productivity.
- The objective is to develop AI systems that act as co-pilots or teammates.
- Unlike traditional workers, these AI systems should not be priced by their workload but by their ability to augment human capabilities and create new opportunities.
- Co-pilots can be a great incumbent strategy for companies to morph their existing software business model into AI.
- AI is not going to take all jobs, but it will augment human capabilities and make them more efficient.
- A co-pilot-style approach is necessary for humans to use AI systems effectively.
- Collapsing the talent stack, where individuals have multiple skill sets and can simultaneously fulfill different roles, will make projects and teams more effective.
- Humans at work will become more like generalists, with larger scope over various functions, while supervising specialized AI co-pilots.
- Most enterprise AI adoption is still in the experimental budget phase.
- Many enterprises still have a lot of on-prem infrastructure and workflows running on mainframes.
- Cloud technology, which is considered mature from a startup perspective, still doesn't have full adoption in enterprises.
- There is a concern that AI adoption may follow a similar pattern to autonomous driving, with a period of plateauing progress.
- Unlike self-driving cars, where there was an initial breakthrough followed by incremental improvements, AI models are constantly improving with new scientific advancements.
- Breakthroughs like universal multimodal models (for example, GPT-4o) are visible and continue to enhance the capabilities of AI models.
- AI models are already deployed and don't need to reach a specific level of reliability before deployment.
- AI services companies, which help implement AI in large enterprises, are projected to be bigger than the model providers themselves in terms of revenue.
- This projection has been proven true, with some people who criticized the prediction later acknowledging their mistake when the revenue numbers were revealed.
- AI services providers may not be the biggest players in the long term.
- Companies that turn use cases with product-market fit into repeatable products will likely be the real economic winners.
- Many AI services will eventually be turned into generalizable products.
- There is a concern that regulations on data usage and collection could hinder the progression of AI models.
- Regulatory capture is a concern, with lawmakers potentially favoring established companies and making it harder for new entrants to compete.
- This could lead to a concentration of power in the AI industry and stifle innovation.
- Open (openly released) AI systems often lag behind closed AI systems due to fewer resources and incentives, but they are crucial for the field to keep up with major players.
- The final step to achieving Artificial General Intelligence (AGI) involves finding the optimal interface between AI systems and humans.
- The current sequential approach to AI technology development is not ideal. Instead, we should consider how humans will use AI systems and design the entire solution end-to-end.
- More attention should be given to developing effective ways for humans to interact with, supervise, and correct AI systems as they become more intelligent.
- Human interaction with AI systems is richer and more effective than writing instructions, especially as systems become more intelligent.
- David Luan thinks agents and chatbots will diverge into two distinct products: one for practical tasks and the other for therapeutic or entertainment purposes.
- The biggest misconception about AI in the next decade is that it will fully automate every human capability. Instead, AI will likely serve as a tool to enhance human intelligence.
- In five years, agents could become like a non-invasive brain-computer interface, allowing humans to think and reason at a higher level.
- One reason why this vision of agents might not happen is the resistance from incumbent software companies that bundle software into specific functional areas, preventing agents from bridging different domains.
- Luan believes that the potential market for agents is much larger than that of robotic process automation (RPA) because agents can address a wider range of tasks.
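
The scaling observation in the notes above (diminishing returns per added GPU, but steady returns per doubling of GPUs) is what a logarithmic curve would produce. The following is a minimal sketch under that assumption only: the function names, the log2 form, and the numbers are hypothetical illustrations, not figures from the episode.

```python
import math


def capability(gpus: float, base: float = 0.0, gain_per_doubling: float = 1.0) -> float:
    """Hypothetical log-linear curve: every doubling of GPUs adds a fixed gain.

    The log2 form and both parameters are assumptions made for illustration.
    """
    return base + gain_per_doubling * math.log2(gpus)


def marginal_gain(gpus: int) -> float:
    """Gain from adding a single GPU to a cluster of the given size."""
    return capability(gpus + 1) - capability(gpus)


cluster_sizes = (1000, 2000, 4000, 8000)

# Gain from doubling each cluster: constant under the assumed curve.
doubling_gains = [capability(2 * n) - capability(n) for n in cluster_sizes]

# Gain from one extra GPU: shrinks as the cluster grows.
marginal_gains = [marginal_gain(n) for n in cluster_sizes]

for n, d, m in zip(cluster_sizes, doubling_gains, marginal_gains):
    print(f"{n:>5} GPUs: gain per doubling = {d:.3f}, gain per extra GPU = {m:.6f}")
```

Under this assumed curve, the per-doubling gain stays flat while the per-GPU gain keeps falling, which matches the claim that a predictable improvement requires doubling compute each time.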