Principles for Building One-Shot AI Agents

(edgebit.io)

93 points | by robszumski 242 days ago

4 comments

TZubiri 240 days ago
What is a “one-shot” AI Agent? A one-shot AI agent enables automated execution of a complex task without a human in the loop.
Not at all what one-shot means in the field. Zero-shot, one-shot and many-shot means how many examples at inference time are needed to perform a task
Zero shot: "convert these files from csv to json"
One shot: "convert from csv to json, like "id,name,age/n1,john,20" to {id:"1",name:"tom",age:"20"}
[-]
- devmor 240 days ago
  Given the misunderstandings and explanation of how they struggled with a long-solved ml problem, I believe this article was likely written by someone without much formal experience in AI.
  This is probably a case where some educational training could have saved the engineer(s) involved a lot of frustration.
  [-]
  - zavec 239 days ago
    As a casual ML non-practicioner, what was the long-solved ML problem they ran up against?
    [-]
    - devmor 239 days ago
      Both “Principle 1” and “Principle 2” in the article are essentially LLM-focused details of basic principles in ML that have been known since before I (and probably you, if you’re still working age) were born.
- robszumski 240 days ago
  Fair criticism. I was going for the colloquial usage of "you get one shot" but yeah I did read that Google paper the other day referring to these as zero-shot.
  [-]
  - tough 239 days ago
    fully-autonomous makes more sense in the agentic vicab imho
    at the end its fine if the agent self corrects amongst many shots too
sebastiennight 240 days ago
> A different type of hard failure is when we detect that we’ll never reach our overall goal. This requires a goal that can be programmatically verified outside of the LLM.
This is the largest issue : using LLMs as a black box means for most goals, we can't rely on them to always "converge to a solution" because they might get stuck in a loop trying to figure out if they're stuck in a loop.
So then we're back to writing in a hardcoded or deterministic cap on how many iterations counts as being "stuck". I'm curious how the authors solve this.
[-]
- NoTeslaThrow 239 days ago
  Surely the major issue is thinking you've converged when you haven't. If you're unsure if you've converged you can just bail after n iterations and say "failed to converge".
- bhl 240 days ago
  Just give your tool call loop to a stronger model to check if it’s a loop.
  This is what I’ve done working with smaller model: if it fails validation once, I route it to a stronger model just for that tool call.
  [-]
  - behnamoh 240 days ago
    > if it fails validation once, I route it to a stronger model just for that tool call.
    the problem the GP was referring to is that even the large model might fail to notice it's struggling to solve a task and keep trying more-or-less the same approaches until the loop is exhausted.
    [-]
    - sebastiennight 239 days ago
      Exactly. You'd still be in a non-deterministic loop, just a more expensive one.
      [-]
      - namaria 239 days ago
        Ashby in 1958 pointed out the law of requisite variety. It should have preempted expert systems and it should preempt the current agents fad. An automatic control system of general application would tend toward infinite complexity.
- randysalami 240 days ago
  I think we need quantum systems to ever break out of that issue.
  EDIT: not as to creating an agent that can do anything but creating an agent that more reliably represents and respects its reality, making it easier for us to reason and work with seriously.
  [-]
  - sebastiennight 240 days ago
    Could you share the logic behind that statement?
    Because here I'm getting "YouTuber thumbnail vibes" at the idea of solving non-deterministic programming by selecting the one halting outcome out of a multiverse of possibilities
    [-]
    - dullcrisp 240 days ago
      ELI40 “YouTuber thumbnail vibes?”
      [-]
      - sebastiennight 239 days ago
        YouTube's algorithm has created over the last ~5 years an entire cottage industry of click-maximizing content creators who take any interesting scientific discovery or concept, turn it into the maximally hypey claim they can, and make that the title of their videos with a "shocked-face" thumbnail.
        E.g. imagine an arxiv paper from French engineer sebastiennight:
        Using quantum chips to mitigate halting issues on LLM loops
        It would result the same day in a YT video like this:
        Thumbnail: (SHOCKED FACE of Youtuber clasping their head next to a Terminator robot being crushed by a flaming Willow chip) Title: French Genius SHOCKS the AI industry with Google chip hack!
      - pmichaud 240 days ago
        I think he means just try shit until something works better.
    - randysalami 240 days ago
      That would be some Dr. Strange stuff. I’m just saying a quantum AI agent would be more grounded when deciding when to stop based on the physical nature of their computation vs. engineering hacks we need for current classical systems that become inherently inaccurate representations of reality. I could be wrong.
      [-]
      - daxfohl 240 days ago
        Quantum computation is no different than classical, except the bit registers have the ability to superpose and entangle, which allows certain specific algorithms like integer factorization to run faster. But conceptually it's still just digital code and an instruction pointer. There's nothing more "physical" about it than classical computing.
        [-]
        daxfohl 239 days ago
        And it's definitely not "try every possibility in parallel", as is sometimes portrayed by people who don't know better. While quantum computing makes it possible to superpose multiple possibilities, the way quantum mechanics works, you can only measure one (and you have to decide ahead of time which one to measure, i.e. you can't ask the quantum system like "give me the superposition with the highest value"). That's why only a few specific algorithms are aided by quantum computing at all. Integer factorization (or more generally, anything that uses Fourier transforms) is the biggest, where it's exponential speedup, but most others are just quadratic speedup.
        And even if you could simulate and measure multiple things in parallel, that still wouldn't let you solve the halting problem, which would require simulating and measuring infinite things in parallel.
        Another way of saying it: everything that can be done on a quantum computer can also be done on a classical computer. It's just that some specific algorithms can be done much faster on a quantum computer, and in the case of integer factorization, a quantum computer could factor numbers larger than would ever be practical on a classical computer. But that's really it. There's nothing magical about them.
        [-]
        randysalami 238 days ago
        “Nature isn’t classical, dammit, and if you want to make a simulation of nature, you’d better make it quantum mechanical, and by golly it’s a wonderful problem, because it doesn’t look so easy” (Richard Feynman). Quantum systems are physical systems, classical systems due to their very nature only can emulate it. When it comes to agents like we were discussing before, a classical agent will always be limited by the abstractions needed to get it to understand the real world. A quantum agent would actually “get” the world. The difference is fidelity and classic systems will only ever be an approximation.
        [-]
        daxfohl 237 days ago
        This is irrelevant. First, quantum programs don't "get" anything just by virtue of being quantum code, any more than classical computers "get" the foundations of electricity and magnetism because they use electrons. Second, classical computers absolutely can simulate quantum systems. They're inefficient, but they can do it. Third, determining whether an agent is stuck in an infinite loop has nothing to do with the physical world. They're just binary code running on a Turing machine. Fourth, the halting problem is provably unsolvable in both classical and quantum systems, so there's not even a relationship here. Fifth, what do you mean by quantum systems are physical? Does this mean classical systems aren't? Are physical systems only those that use every aspect of physics? Then quantum systems aren't physical either because they can't account for gravity. So do we need quantum gravitational computers? Sixth, what does "getting" quantum mechanics that have to do with AI agents? Do I need to understand quantum physics before I can have a conversation with someone? Can I not read an email without an appeal to hilbert space? Just, none of this is related to quantum computing. It's like having a bug in a deployment and saying string theory would've prevented that.
        [-]
        daxfohl 237 days ago
        I see intermixing of the terms quantum computer and quantum system. These are different concepts, and I think that's the source of the confusion. A quantum computer is a well defined thing. It's just like a classical computer but instead of just binary bits, it can work with qubits that support superposition. But it still needs programs just like a classical computer, and the results that we can actually read are binary in both cases.
        Both of them are quantum "systems", in that both require quantum physics to work, if we're considering modern CPU gate sizes. Just, classical computers expose binary bits, and quantum computers expose qubits.
        What I think you're picturing is a quantum "system", like a blob of quantum goo, that you can toss some "state" into and...something. But, that's not what a quantum computer is, any more than a classical computer is something you could throw into a blob of electrical goop and expect it to do anything.
  - devmor 240 days ago
    I don’t believe quantum computers can solve the halting problem, so I don’t think that would actually help.
    This issue will likely always require a monitor “outside” of the agent.
    [-]
    - randysalami 240 days ago
      I think you’re right that they can’t “solve” the halting problem but are more capable at dealing with it than classic ai agents and more physically grounded. Outside monitoring would be required but I’d imagine less so than classical systems and in physically different ways; and to be fair, humans require monitoring too if they should halt or not, haha.
      [-]
      - devmor 239 days ago
        Can you explain why you think this? I’m curious.
        Humans don’t encounter an infinite loop problem because we are not “process-bound” - we may spend too long on a task but ultimately we are constantly evaluating whether or not we should continue (even if our priorities may not be the same as whoever assigned us a task). The monitoring is built-in, by nature of our cross-task processing.
        [-]
        randysalami 238 days ago
        100%. We have built-in faculties for stopping and halting. My point wasn’t that humans physically need a monitor to determine when to stop or else suffer an infinite loop; sleep, eating, and death are perfectly effective at that. I was making a bit of a joke in the efficacy of agents being subjective around halting. A classical or quantum agent might go on forever to solve its goal, getting stuck and needing an outside monitor to reset or redefine it. Contrast that to a human agent; given a goal, they might never even try to solve it in the first place! Without outside monitors, systems of human agents may not start when needed or halt when optimal yet we’ve kept it going for thousands of years!
robszumski 240 days ago
Author of the post, love to see this here.
Curious what folks are seeing in terms of consistency of the agents they are building or working with – it's definitely challenging.
lerp-io 240 days ago
u can’t one shot anything, you have to iterate many many times.
[-]
- canadiantim 240 days ago
  You one-shot it, then you iterate.
  Sounds tautological but you want to get as far as possible with the one-shot before iterating, because one-shot is when the results have the most integrity
  [-]
  - tough 239 days ago
    tighter feedback loops the closer your shotting instances are to each other