I first experimented with modern AI tools, including Stable Diffusion image models and GPT-based chat applications, about two years ago. Since then, I’ve kept up with the latest developments and tried incorporating these tools into my daily workflow. The AI programming tools I initially tested a year ago were interesting but didn’t significantly impact my work.
Recently, however, these tools have improved dramatically. I now use them daily, and they’ve positively impacted my productivity. For the initial prototyping of the Athena project, these AI tools have been invaluable. Using unfamiliar tools to build something new, I’ve been able to work much faster and more confidently, effectively doing the work of two or three engineers.
After working intensely with these tools for the past few months, I’d like to share my observations: what these tools can do today, their current limitations, and how we can ensure they continue to improve and reach their full potential.
Thing #1: What I Have Today 🤗
“AI is not a man in a box. It's a million minds in a web.” - ChatGPT, attempting (and failing) to quote Ethan Mollick
Today, I have an AI companion who excels at some tasks but fails miserably at others, even those that seem straightforward. Ethan Mollick refers to this as the jagged frontier of AI abilities. On the positive side, my companion is exceptionally good at explaining other people’s code. Before this project, I didn’t have much experience with Rust. While Rust is readable, I struggle with sophisticated macros and high degrees of abstraction. My AI companion can turn pages of incomprehensible code into clear, well-structured paragraphs explaining what the code does and why it’s implemented that way. This is very powerful; it’s something even an experienced human coder would struggle to do. (Reading someone else’s code is hard.)
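To give a flavor of what I mean, here’s a small, made-up example (not from the Athena codebase) of the kind of macro-heavy Rust that’s quick to write and slow to read cold. My companion is very good at unpacking this sort of thing into plain English:

```rust
// Hypothetical example: a declarative macro that generates trait impls.
// Dense but idiomatic Rust like this is exactly where an AI explainer
// earns its keep.

trait Opcode {
    fn mnemonic(&self) -> &'static str;
    fn cycles(&self) -> u32;
}

// The macro hides the repetitive impl blocks behind a terse mini-DSL.
macro_rules! opcodes {
    ($($name:ident => ($mnem:expr, $cycles:expr)),* $(,)?) => {
        $(
            struct $name;
            impl Opcode for $name {
                fn mnemonic(&self) -> &'static str { $mnem }
                fn cycles(&self) -> u32 { $cycles }
            }
        )*
    };
}

opcodes! {
    Add => ("add", 1),
    Mul => ("mul", 3),
    Load => ("lw", 2),
}

fn main() {
    // Without mentally expanding the macro, it isn't obvious that `Add`,
    // `Mul`, and `Load` are unit structs implementing `Opcode`.
    let ops: [&dyn Opcode; 3] = [&Add, &Mul, &Load];
    for op in ops {
        println!("{} takes {} cycle(s)", op.mnemonic(), op.cycles());
    }
}
```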
Another positive is that my companion is good at what I call sparring. I can discuss ideas and designs with this companion, and it usually does a good job of outlining the pros and cons of each option. However, I must ask clear, concise questions to get useful answers and sometimes rephrase questions multiple times to get a reasonable response.
When I ask my companion directly how to do something, I often get a list of options, some of which are wacky. Even though I’m not a Rust expert, I’m experienced enough to know these suggestions aren’t good. Sometimes my companion digs in and defends a bad suggestion, going deeper in the wrong direction, and it can be hard to coax it back.
For basic tasks, like extracting a design pattern from a code base, my companion is completely unable and unwilling to help, which is vexing.
My companion isn’t great at writing code itself but helps me code better. It acts like a pair programmer, suggesting things while I work. Its suggestions are right about 80% of the time, but I have to be very careful of the other 20%. It can guess what I’m trying to write and scaffold a method, but I always have to double-check its work as it makes both big and small mistakes regularly. Worse, it never asks clarifying questions; it always assumes it knows what I mean, even when it’s wrong.
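Here’s a made-up but representative example of the kind of scaffold it produces, with the sort of subtle flaw I have to catch myself. I’ve left the bug in deliberately and called it out in the comments:

```rust
// Hypothetical AI-suggested scaffold: read a little-endian u32 from a byte
// buffer at `offset`. It looks perfectly plausible at a glance.
fn read_u32_le(buf: &[u8], offset: usize) -> Option<u32> {
    // The subtle flaw: for a very large `offset`, `offset + 4` overflows
    // `usize` (a panic in debug builds, a wrap followed by a slice-index
    // panic in release) instead of cleanly returning None. A safer check is
    // `buf.len().checked_sub(4).map_or(false, |max| offset <= max)`.
    if offset + 4 > buf.len() {
        return None;
    }
    let b = &buf[offset..offset + 4];
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

fn main() {
    let buf: [u8; 4] = [0x78, 0x56, 0x34, 0x12];
    assert_eq!(read_u32_le(&buf, 0), Some(0x1234_5678));
    assert_eq!(read_u32_le(&buf, 1), None);
    println!("ok");
}
```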
Letting my companion write code is dangerous, like having autopilot in your car doing wacky, dangerous things every few minutes. You have to keep your hands firmly on the wheel.
Despite these issues, my companion helps me get less stuck and find answers much faster than before. With its help, I feel confident writing complicated code in a language I’m not very familiar with, which is exciting.
Thing #2: What I Want 🎯
"The best way to predict the future is to invent it." – Alan Kay
What I’d really like is people: teammates and collaborators in the room with me to whiteboard, discuss problems, and brainstorm ideas. I wish I had an army of interns to handle grunt work so I could focus on the exciting parts of the project.
Failing that, I want an AI architect to do what I do today, so I can focus on fun things. If that’s not possible, I’d settle for an army of intelligent AI companions that don’t need as much babysitting as my current companion.
Ideally, I want an AI that’s smarter than I am. I want a companion on par with someone I’d hire to join my team. My key to success is being the dumbest person in the room. Failing that, I’d take one as smart as I am.
In addition to intelligence, I need trustworthiness. Trust and intelligence aren’t the same, but they are correlated. I want to delegate tasks, knowing they’ll be handled, and expect my companion to come back only if it’s really stuck. I don’t want an AI that returns with a hundred dumb questions. I need someone I’d actually hire, not an idiot savant intern good at a few things but terrible at most.
I can use my companion as a sparring partner to poke holes in my design, but that’s insufficient. I want to delegate tasks and know they’ll get done. This could mean delegating high-level design and architecture or low-level drudgery. Either would be helpful, but neither is possible today. I want to write a straightforward prompt like, “an RV32E VM that is secure and performant both when compiled and run natively, as well as when run through a ZK circuit, that also integrates seamlessly with a blockchain and knows about things like accounts, balances, and states,” and expect a working end-to-end prototype.
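To make concrete how much work hides behind that one-sentence prompt, here’s a deliberately tiny, hypothetical sketch (not Athena code) of just the innermost piece: a fetch-decode-execute loop covering two RV32E instructions. Everything else the prompt asks for, the ZK circuit, the blockchain state, the security story, is nowhere in sight:

```rust
// Hypothetical sketch: the core loop of an RV32E interpreter handling only
// ADDI and ADD. RV32E is RV32I with 16 registers instead of 32.
struct Vm {
    regs: [u32; 16], // x0..x15; x0 is hard-wired to zero
    pc: u32,
    mem: Vec<u8>,
}

impl Vm {
    fn fetch(&self) -> u32 {
        let i = self.pc as usize;
        u32::from_le_bytes([self.mem[i], self.mem[i + 1], self.mem[i + 2], self.mem[i + 3]])
    }

    fn step(&mut self) {
        let insn = self.fetch();
        let opcode = insn & 0x7f;
        // Register fields are 5 bits; values 16..31 are illegal in RV32E and
        // would panic on the array index below, which is fine for a sketch.
        let rd = ((insn >> 7) & 0x1f) as usize;
        let rs1 = ((insn >> 15) & 0x1f) as usize;
        match opcode {
            0x13 => {
                // ADDI: the immediate is the sign-extended upper 12 bits
                let imm = (insn as i32) >> 20;
                self.regs[rd] = self.regs[rs1].wrapping_add(imm as u32);
            }
            0x33 => {
                // ADD
                let rs2 = ((insn >> 20) & 0x1f) as usize;
                self.regs[rd] = self.regs[rs1].wrapping_add(self.regs[rs2]);
            }
            _ => panic!("unimplemented opcode {opcode:#x}"),
        }
        self.regs[0] = 0; // x0 always reads as zero
        self.pc = self.pc.wrapping_add(4);
    }
}

fn main() {
    // addi x1, x0, 5 ; addi x2, x0, 7 ; add x3, x1, x2
    let program: [u32; 3] = [0x0050_0093, 0x0070_0113, 0x0020_81b3];
    let mut mem = Vec::new();
    for word in program {
        mem.extend_from_slice(&word.to_le_bytes());
    }
    let mut vm = Vm { regs: [0; 16], pc: 0, mem };
    for _ in 0..3 {
        vm.step();
    }
    assert_eq!(vm.regs[3], 12); // 5 + 7
    println!("x3 = {}", vm.regs[3]);
}
```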
If I gave this task to a reasonably intelligent, resourceful human, it might take a while, but the task would be clear and I’d expect it to get done. A competent human assistant would clarify the task by asking questions and doing homework. They’d work independently and return only when they have something to show or a critical question.
If I give this prompt to my AI companion, I get a reasonable response and a game plan. I can debate and ask questions, even get code samples—but those samples have big holes. I can’t just say, “do it all yourself.”
I want not only human-level intelligence (we’re not there yet) but also human-level initiative, determination, creativity, and a desire to impress and do good work. Whether it’s possible to achieve this in a machine without making it effectively human remains to be seen, but it’s worth trying.
Thing #3: What To Do 🔮
"We cannot solve our problems with the same thinking we used when we created them." – Albert Einstein
AI tools have steadily improved over the past few years, but it’s not guaranteed that this trend will continue. Even if it does, there are risks. The biggest risk is that AI goes the way of the web and apps we have today, where the interests of big companies overshadow those of everyday users. How do we ensure AI tools improve and reach their potential without compromising user interests and data? Here are four critical requirements.
First, AI development must continue rapidly. Pausing or slowing down AI due to vague fears of safety or alignment would be a mistake. While the probability of catastrophic outcomes (p(doom)) isn’t zero, it’s quite low. As with nearly every other technology, the benefits far outweigh the risks. AI tools can only fulfill their potential if we keep developing them rapidly.
Second, we need open source across the AI stack, including backend models, frontend inference engines, and everything in between. Open source models like Stable Diffusion and Llama are great steps but not enough. We need many domain-specific open source models and inference engines. The Internet only succeeded due to open protocols like HTTP and SMTP, and software thrived after the open source movement took off. Despite this foundation, we allowed open Internet protocols to be captured by big companies. We must establish the right, open foundations for AI infrastructure before it’s too late.
Building in AI today is difficult because, like software before Linux, most things are proprietary. If an aspiring entrepreneur wanted to build a tool like the companion I described, they’d have a few general-purpose “frontier” models, but limited options for software-specific models. We can do better.
Third, we need decentralization and censorship resistance. Tools like ChatGPT are unable to discuss many topics due to artificial restrictions. Some are sensible, like not telling a teenager how to harm themselves, but most are silly and political (try asking for a dirty joke). If we believe AI tools will transform our future, it’s crucial that everyone has access to uncensored tools, free from the subjective sensibilities of a small group of Silicon Valley elites. Venice.ai shows what’s possible with open source and decentralization. We need more open source, more open models, and more decentralization to prevent dystopian AI outcomes.
Finally, we need a lot more experimentation. While I’ve seen many AI tools for image generation and chat, there are few specifically for coding. Other domains would benefit from greater experimentation, but this is limited by the lack of open source models and infrastructure.
Another limiting factor is culture and behavior, which change slowly. In a recent seminar I attended with several hundred senior executives, only about 20% had used AI tools, and even fewer used them daily. I’m surprised how few people I know use AI tools in their daily workflow, and of those who do, few use tools other than Copilot and ChatGPT regularly. We need more experimentation and a mindset that enables and celebrates it. Companies must be unafraid to put AI tools in the hands of their employees.
By focusing on these principles, we can ensure that AI tools not only improve but also remain aligned with the interests of all users. The potential of AI is immense, and with the right approach, we can harness this power to create a more efficient, innovative, and equitable future.