The "Normsky" architecture for AI coding agents — with Beyang Liu + Steve Yegge of SourceGraph | Latent Space: The AI Engineer Podcast

This podcast episode covers AI coding agents and the tools behind them. The speakers introduce Sourcegraph, a code intelligence startup, and its motivation for building efficient code search and intelligence tools, emphasizing how code search improves productivity and helps developers navigate complex codebases. Steve shares his experiences at Google and Grab, highlighting the differences between the two companies and Grab's success as a super app.

The speakers also discuss Cody, an AI coding assistant developed by Sourcegraph, and its features such as code generation and question answering. They emphasize the significance of context in AI coding agents and the benefits of using coding assistants in programming tasks. The episode explores code completions, data models, code generation tools, the use of language models in code generation, and the importance of context and testing in AI development. It also delves into the code graph, data pre-processing, data privacy, and the evolution of AI approaches.

The speakers discuss the challenges of integrating code graphs with build systems and the potential future of AI-enhanced engineering. They explore the value of classic search techniques, the importance of fast inference and model fine-tuning, and the potential of scaling AI models. The episode concludes with discussions on the future of coding assistance, the evolving role of AI in understanding code, and unsolved questions in AI development.

Takeaways

• Sourcegraph was started because of the pain of working with large codebases and the need for efficient code search and intelligence.
• Code search tools like Grok, Kythe, Hound, and Zoekt are significant in enhancing coding productivity and navigating complex codebases.
• Grab, a super app in Southeast Asia, serves as an inspiration for Sourcegraph's vision of creating a code intelligence startup.
• Cody, the AI coding agent developed by Sourcegraph, offers code generation and question answering capabilities.
• Context is crucial in AI coding agents, and Cody's fast, high-quality context sets it apart from other AI coding agents on the market.
• Data models play a vital role in shaping the performance and effectiveness of code completion systems.
• Code generation tools are moving from general-purpose requests toward more targeted features, and this transition is a focus for improvement.
• Building best-in-class code generation models requires search-based approaches, such as tree search, with language models serving as advisors.
• The Norvig school and the Chomsky school represent two main schools of thought in AI, with the Norvig school emphasizing learning from data and the Chomsky school emphasizing formal systems and precise constructs.
• The hybrid "Normsky" architecture incorporates both the Norvig and Chomsky approaches, enabling comprehensive and valuable AI tools for developers.
• Data pre-processing is crucial in enabling advanced applications and improving the performance of AI models in tasks such as code generation and answer retrieval.
• Data privacy and responsible practices in data pre-processing are paramount, with Sourcegraph guaranteeing the privacy of customer data.
• The possibility of programming without understanding domain-specific languages (DSLs) raises questions about the future importance of DSLs in coding.
• Integrating code graphs with build systems presents challenges, and approaches like the SCIP protocol aim to address them.
• The BFG (Big Friendly Graph) provides valuable context for AI coding tools and helps eliminate type errors in code completions.
• Scaling AI models presents opportunities for advancement, but other algorithms and approaches need to be considered alongside scaling.
• Classic search techniques still hold value and can be more effective than newer, trendier options in practical coding scenarios.
• Fast inference and model fine-tuning are crucial in software development, and the expertise of the team at Sourcegraph allows for continuous improvement in these areas.
• Sourcegraph aims to enable efficient development in complex codebases by making them more understandable and manageable for tech leads and engineering leaders.
• AI coding tools like Cody, which can understand code at the codebase level, have the potential to transform coding processes and make coding more lovable.
• The future of coding assistance lies in reliably generating working code on the first try, using architecture and entity relationship diagrams, and developing more agentic and sophisticated coding assistants.
• Leveraging different forms of intelligence and understanding how language models behave under various query conditions are key to fully harnessing the power of coding assistants.

Outlines

00:24 Introduction and Motivation

03:10 Steve's Experiences and Grab's Success

06:24 Introduction to Sourcegraph and Cody

08:30 Differentiating Cody from other AI Coding Agents

12:30 Cody's Approach to Code Completions with Context Fetching

14:03 The Importance of Data Models in Code Completion Systems

18:56 Evolution of Interface and Challenges with Inline Fixes

24:00 The Challenge of Code Generation and the Chess Player Analogy

26:07 The Evolution of Artificial Intelligence Approaches

29:18 The Importance of Context and Testing in AI Development

32:12 Broadening the Notion of the Code Graph: Beyond Git Repositories

35:20 Data Privacy and the Value of Data Pre-Processing

38:15 The Shift Towards DSL-Free Programming

39:35 The Rise of AI-Enhanced Engineering

44:36 Challenges in Integrating Code Graphs with Build Systems

46:03 Leveraging Graph Context for AI Coding Tools

50:43 The Future of AI and Scaling Models

54:38 The Value of Classic Search Techniques

57:19 The Importance of Fast Inference and Model Fine-tuning

1:02:58 The Future of Sourcegraph in the Engineering Org

1:12:01 The Power of LLMs and AI in Understanding Code

1:16:01 The Future of Coding Assistance