Introduction
Codecompass.ai is an innovative AI-powered tool designed to revolutionize how development teams interact with their codebases. Built by Arnab Chatterjee and Chhakuli Zingare , codecompass.ai allows team members and individuals to engage in chat with their project's code, providing instant, accurate answers to code-related questions. This powerful tool enhances productivity, accelerates onboarding, and ensures seamless collaboration, making it an essential asset for modern development teams.
The Problem
Maintaining a deep understanding of a complex codebase is a constant challenge in any development team. This complexity is amplified in larger projects, where code evolves rapidly, and documentation may lag behind. Whether you're a seasoned developer or a new joiner, navigating through extensive code can be time-consuming and sometimes frustrating. Questions like "What does this function do?"
, "Where is this class used?"
, or "How do these modules interact?"
often require digging through layers of code and documentation.
For new team members, this challenge is even more pronounced. The onboarding process, while crucial, can be slow and overwhelming as they familiarize themselves with the project's intricacies. Mentoring from senior developers is invaluable but can divert their attention from critical tasks, impacting the team's overall productivity.
However, even experienced developers can find themselves lost in a sea of code when revisiting less familiar parts of the project. Every team member, regardless of their tenure, faces moments of uncertainty where they need quick and reliable answers about the codebase.
The Solution
This is where codecompass.ai steps in, revolutionizing how teams interact with their code. By leveraging advanced AI technologies, codecompass.ai acts as an intelligent assistant that provides immediate, accurate answers to any code-related query. Imagine being able to ask, "How is this function connected to other parts of the system?" and receiving a comprehensive, clear response within seconds.
This is how it all works, We will discuss the inner workings little bit later in the blog:
For new joiners, codecompass.ai significantly reduces the onboarding time. They can quickly get up to speed without feeling overwhelmed, asking the AI questions about the codebase and receiving instant, relevant information. This allows them to integrate into the team and start contributing faster, fostering a more efficient and inclusive development environment.
For experienced developers, codecompass.ai is a powerful tool that enhances productivity. It eliminates the need for tedious searches through documentation or code files, providing immediate insights and freeing up time for more critical tasks. This AI assistant becomes an integral part of the development process, ensuring that all team members have easy access to the knowledge they need, exactly when they need it.
Key Features of Codecompass.ai:
Codecompass.ai is packed with features designed to enhance productivity, streamline onboarding, and facilitate seamless collaboration within development teams. This innovative tool provides developers with a powerful AI assistant that integrates effortlessly into their existing workflows, ensuring that critical information is always just a query away. Here are the key features that make codecompass.ai an indispensable asset for modern development teams:
Contextual Q&A Engine
- Get instant, accurate answers to your code-related questions. Understand a function's purpose, explore class interactions, or find where a module is used.
Data Confidentiality
- Your code and data remain private and secure. Stored in separate Pinecone indexes, ensuring confidentiality and data protection.
Seamless Repository Integration
- Easily add your repository by providing a GitHub link. Quick setup and effortless exploration of your codebase.
Real-time Code Updates
- Automatically pulls the latest code from GitHub, ensuring you always have up-to-date information.
Natural Language Processing
- Interact with your codebase using plain English. Ask questions naturally, and codecompass.ai will provide relevant answers.
Comprehensive Documentation Access
- Seamlessly access and search your project’s documentation. Integrates with documentation tools, reducing the need to switch contexts.
User-friendly Interface
- Enjoy a simple and intuitive user interface designed to enhance productivity. Easy navigation and interactive elements for quick information access.
How We Built It
Building codecompass.ai was an exciting journey that involved careful planning, selecting the right technologies, overcoming challenges, and meticulous coding. Here’s a brief overview of the key steps we took to bring this innovative tool to life.
1. Planning
Me (Arnab Chatterjee) and Chhakuli Zingare began brainstorming with a clear goal: to create something that would make a real difference for developers. We reflected on our early days as junior developers, recalling the countless hours spent trying to understand complex codebases. This shared experience sparked the idea for codecompass.ai—a tool designed to make navigating code easier for everyone.
With this inspiration, we dove into planning sessions. Our initial meetings were filled with energy and excitement as we brainstormed potential features, discussed the user flow, and debated the best technology stack to bring our vision to life. We sketched out our ideas on an Excalidraw screen, mapping out how codecompass.ai could transform the development process.
2. Choosing The Tech Stack
Choosing the right tech stack was crucial to bringing our vision for codecompass.ai to life. We needed technologies that were robust, scalable, and could integrate seamlessly to provide a smooth user experience.
For the backend, we opted to use OpenAI API for vector embedding and large language models (LLMs). The OpenAI API provided the advanced natural language processing capabilities we needed to accurately understand and respond to code-related queries.
We used Pinecone for efficient vector storage and retrieval, ensuring fast and accurate access to our data. LangChainJS was incorporated to manage and streamline interactions with the LLM, making the integration process more efficient and reliable.
On the frontend, we chose Next.js for its powerful features and seamless integration with our backend. To create a modern, responsive, and visually appealing user interface, we used ShadCN and Tailwind CSS. These tools allowed us to design a user-friendly interface that enhances productivity and user engagement.
3. Hurdle
One of the significant challenges we faced was pulling a GitHub repository and transforming it into a format that our LLM could effectively read and interpret. Initially, we struggled with the unstructured data and JSON formats, which didn't yield good results when processed by the LLM.
To overcome this, we developed our own logic to convert the repository data into a structured text document. This structured format allowed the LLM to understand the code better and provide more accurate and contextual responses. We stored this structured data in Pinecone, indexing it by file path to ensure efficient retrieval.
Example:
Plain Text:
[DIR_START]src
[FILE_START]src/index.js
// File content here
[FILE_END]src/index.js
[FILE_START]src/utils.js
// File content here
[FILE_END]src/utils.js
[DIR_END]src
This format clearly shows:
- Directory structure ([DIR_START] and [DIR_END])
- File locations and contents ([FILE_START] and [FILE_END])
- Preservation of code and text exactly as it appears in the repository
As part of this solution, we created an npm package called git-repo-parser. This package pulls data from GitHub and transforms it into an LLM-friendly text format. You can find the package at GitHub and npm. This tool has been crucial in making codecompass.ai effective and reliable for users.
4. Coding
The coding phase of codecompass was a collaborative effort, with each of us focusing on different critical components to bring the project to life.
Chhakuli started by building the homepage, ensuring it was both welcoming and functional.
Meanwhile,I began working on the backend, focusing on reading a GitHub repository and storing the data in a structured format, feeding it to open ai API to generate vector embedding, and sending it to our vector database. This involved proper chunking of the data to ensure efficient retrieval and processing.
As Chhakuli moved on to developing the chat screen, I concentrated on implementing the logic to retrieve data from the vector database. This was crucial for producing accurate and contextual responses, complete with citations to the source code.
This diagram represents the architecture responsible for processing the user's question, leveraging chat history, and an initial prompt, and retrieving relevant information from the Pinecone vector database to generate a well-crafted response.
The process begins with the user submitting a question, which includes context from the chat history. This question and its context are then transformed into a vector representation using a vector embedding model.
The vector representation is queried against the Pinecone vector database, which returns the top 5 most relevant results.
These results are used to create a comprehensive context for the user's question. If needed, this context is further refined to generate a new, more specific question. The refined context and question are then processed by the Large Language Model (LLM), which generates a detailed answer, citing sources as necessary.
This answer is delivered back to the user, ensuring that codecompass.ai provides accurate, contextually relevant, and well-cited responses to user queries, enhancing the overall productivity and efficiency of development teams.
Afterward, my focus shifted to fine-tuning the LLM. This involved tweaking the code and refining the prompts to ensure the AI could accurately read and point to the correct file paths within the codebase and Chhakuli Zingare worked on Refactoring the codebase and adding style and design to the chat page. Our combined efforts in these areas were essential in creating a seamless and efficient user experience for codecompass.
So this is how the Chat screen looks like:
Getting Started with Codecompass.ai
To test our app, go to this link and try it out with an already-added project.
Type the question in the chat box and you are good to go
To add your own project:
Clone the CodeCompass repository.
Add the environment variables in
.env
(check.env.example
for reference).Run
npm install
, thennpm run prepare:data <your GitHub repo URL>
this will Process and push vector embedding of your codebase to your pinecone index that you have setup in your .env file
You are good to go! You can go to https://localhost:3000/chat/<id>
to talk to your codebase and even self-host it for your organization to use
You can see your newly added project in the homepage itself and click explore to have a chat with your codebase
And you are good to go!
Future Directions and Improvements
We are excited about the future of codecompass.ai and have several enhancements planned to make the tool even more powerful and user-friendly:
Easy Login via GitHub
- Implementing a seamless login process using GitHub credentials for quick and secure access.
Simple Project Addition
- Allowing users to add projects directly on the website in just a few clicks, making the setup process even more straightforward.
Chat History
- Introducing the ability to save and view chat history, so users can refer back to previous interactions for continuity and context.
Enhanced Code Explorer
- Developing a more robust code explorer to improve navigation and understanding of the codebase.
Upscaling to Advanced Models
- Upscaling to a text-embedding-large model and possibly integrating GPT-4o LLM for more accurate and nuanced responses.
Save and Export Chat History
Adding features to save chat history and export it, allowing users to
keep records of their interactions and share insights with their teams.
These planned improvements aim to enhance the functionality, usability, and overall experience of codecompass.ai, making it an indispensable tool for development teams.
Conclusion
Building codecompass.ai has been an exciting and rewarding journey for me and my teammate, Chhakuli Zingare We developed this project for the #AIForTomorrow hackathon by Hashnode, and we are incredibly grateful for the opportunity to participate in such a forward-thinking event.
Codecompass.ai is designed to transform the way development teams interact with their codebases. By simplifying onboarding, boosting productivity, and enhancing collaboration, we believe our tool can make a significant impact on modern development workflows. With features like instant contextual Q&A, real-time code updates, and seamless repository integration, codecompass.ai is equipped to address common challenges faced by developers.
We invite you to explore codecompass.ai and experience its benefits firsthand. Visit our App to try it out with an existing project, or follow our setup guide to add your own. We are excited about the potential of codecompass.ai to empower development teams to work smarter and more efficiently.
If you loved our project please like, comment and give us a ⭐ in Github. If you have any queries or feedback, please do not hesitate to comment or reach out. We look forward to hearing from you and seeing how codecompass.ai can make a difference in your development process.