Skip to content

Conversation

@anna239
Copy link

@anna239 anna239 commented Dec 11, 2025

Define a way for an agent to declare different ways to authenticate, this will allow clients to present better UX to users.

@anna239 anna239 requested a review from a team as a code owner December 11, 2025 05:42
Copy link
Member

@benbrandt benbrandt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option might be:

  • provide two new capabilities on the client
    • Request text
    • Request to open link
  • The agent returns auth methods as they do now
  • User picks one of the options
  • Within the authenticate method, the agent asks the client for a key and optionally a link to open to get said key or device code

The reason I bring this up is usually for the oauth flows there are specific links and maybe the need to spin up a local http server (at least for codex this is the case) to handle the redirect post-auth. So I don't know that the agent can upfront provide the urls without entering in to a specific auth flow.

Maybe what i am proposing is too generic and opens pandora's box though....

"name": "OpenAI api key",
"description": "Provide your key",
"type": "envVar",
"varName": "OPEN_AI_KEY",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: is it too late to add the env var since the process has already started?
Could we maybe just model this as a request for a text field that the user pastes the token into? and then the agent would store it somewhere like it usually does?

Or maybe you had an idea here that I am missing?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we'll have to restart the process indeed. If the agent is ok with just accepting a key in JsonRpc call, then it should declare 3. option — Provided key

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this method should rather be included in #289 ? i.e. static information known before startup?

But I guess we want to allow the user to choose? So by choosing this we'd restart the agent for them but at least they could choose if that is what they want from the options?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I guess we want to allow the user to choose? So by choosing this we'd restart the agent for them but at least they could choose if that is what they want from the options?

yes, that was my idea

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below may be too nuanced for the example, which could be best discussed as "ENV Only" "Env Supported" which happens.. Wherever there is support for a "restart only" config (args, env, files) it ideally is representable in the discovery/registry format in a primitive. In worst case we can just punt the credential scope as "restart" because people can always read the docs to figure out what that means.

I think this discussion ties into the #289 (registry) issue.. we need out-of-band info about auth when it implies a process restart. Also, in some cases there is a tiering. many agents have a env fallback to a file or some other process. Not to complicate this, but if something is strictly ENV only then it should be definable pre-start (registry) and if not, we shouldn't mistake the option for an ENV for only supporting ENV

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I got your idea right, you suggest putting some auth options (env vars) to the registry, and some (oauth) would strill be returned by the agent during its work. I thought about it but decided against for the follwing reasons:

  1. it's better to have same concepts in one place
  2. user can decide on auth mechanism, when they see all the options
  3. if we want to return available options in AuthRequired Error, we'll need Env options there anyway

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for responding. seems reasonable

@anna239
Copy link
Author

anna239 commented Dec 11, 2025

@benbrandt why I think approach with agent requesting a text input won't work as good as we want it to :
On the client we would like to understand that a request is an authentication request, we can present it in a different way, we can also suggest a user to input their LLM token they've already provided in the ide

@codefromthecrypt
Copy link
Contributor

Ack ACP and MCP are different specs (editor-agent vs. tool integration), but MCP's elicitation (URL mode) provides prior art for out-of-band credential handling that avoids LLM/agent exposure. It pauses for browser input without touching the protocol flow. Is this relevant to the design discussion for how to handle sensitive info exchange in ACP?

https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation

@benbrandt
Copy link
Member

Yeah I think MCP elicitation is not a bad way to approach it for two reasons:

  1. Client will likely already need to support it if we want to allow MCP servers to use it
  2. I think the two use cases it covers are roughly what we need

I have some quibbles with the api design... but I don't know that it is worth changing for reason 1 above.

The interesting thing will be that this elicitation would come outside the context of a session. Which I believe you brought up yesterday @anna239 in our call: we may want a very explicit stage for this, because if we allow for elicitation, we'd need to know if it should show up within a session feed or somewhere else.

@anna239
Copy link
Author

anna239 commented Dec 12, 2025

I agree that URL-elicitation mechanism would work well for the oauth scenario, let me update rfd with this approach

@anna239
Copy link
Author

anna239 commented Dec 13, 2025

Added part about MCP-like elicitation mechanism for oauth, @codefromthecrypt thank you so much for pointing this out to me.
Not sure, maybe I should create a separate RFD for the elicitation mechanism, as it's not limited to authentication only and can be used separately too.

"name": "OpenAI api key",
"description": "Provide your key",
"type": "startParam",
"paramName": "OPEN_AI_KEY",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this also be handled if we supported the full elicitation options?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's more like env var, so this one also requires restart

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we add requiresRestart would also be here

@codefromthecrypt
Copy link
Contributor

I began studying this today (including which editors handle things similarly and might be impacted). I didnt finish that research but dont block on me. I will comment before or after the fact once I parse things well enough to be relevant!

Copy link
Contributor

@codefromthecrypt codefromthecrypt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like real progress

I think these folks who are building auth handling can also validate, and maybe not mentioned?

Meanwhile I think I read @anna239 suggested to split this

  1. Auth Types RFD - Just the 4 types + requiresRestart + authParams
  2. Elicitation RFD - General-purpose mechanism, not auth-specific

Happy to see things moving

"name": "OpenAI api key",
"description": "Provide your key",
"type": "envVar",
"varName": "OPEN_AI_KEY",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below may be too nuanced for the example, which could be best discussed as "ENV Only" "Env Supported" which happens.. Wherever there is support for a "restart only" config (args, env, files) it ideally is representable in the discovery/registry format in a primitive. In worst case we can just punt the credential scope as "restart" because people can always read the docs to figure out what that means.

I think this discussion ties into the #289 (registry) issue.. we need out-of-band info about auth when it implies a process restart. Also, in some cases there is a tiering. many agents have a env fallback to a file or some other process. Not to complicate this, but if something is strictly ENV only then it should be definable pre-start (registry) and if not, we shouldn't mistake the option for an ENV for only supporting ENV

"name": "OpenAI api key",
"description": "Provide your key",
"type": "startParam",
"paramName": "OPEN_AI_KEY",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we add requiresRestart would also be here


3. Agent auth

Same as what there is now – agent handles the auth itself, this should be a default type if no type is provided for backward compatibility
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should be explicit that Agent auth MUST not require a restart unless explicitly marked otherwise (due to some facet not covered by ENV like static file cold reload)


### AuthErrors

It might be useful to include a list of AuthMethod ids to the AUTH_REQUIRED JsonRpc error. Why do we need this if they're already shared during `initialize`:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe an example like { "code": -32000, "data": { "availableMethodIds": [...] } }

@ignatov
Copy link
Contributor

ignatov commented Dec 26, 2025

Hey! What do you think about _meta.terminal-auth should we graduate it and make official?

@ignatov
Copy link
Contributor

ignatov commented Dec 26, 2025

Hey!

I published a very early version of the ACP registry: https://github.com/agentclientprotocol/registry

For now, I have listed only agents that have some authentication methods. At the moment, this is mostly opening a browser and starting an HTTP server from the agent process, or launching a terminal via _meta.terminal-auth.

I think that supporting a proper authentication flow is crucial for smooth distribution in editors and IDEs.

@anna239
Copy link
Author

anna239 commented Jan 5, 2026

@codefromthecrypt @benbrandt hi folks, what do you think about merging this and starting the implementation? Or are there any more concerns we have to resolve before that?

Copy link
Member

@benbrandt benbrandt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm reading the current state and thinking about this some more, I think we need to think through how we handle the workflows that cannot handle reauthenticating while running (needing a restart)

I think we should move this to a synchronous call to talk through, maybe I am just missing something here, but I feel like we might be introducing too many variations here, or at least they could potentially be providing in a simpler way

"type": "envVar",
"varName": "OPEN_AI_KEY",
"link": "OPTIONAL link to a page where user can get their key",
"requiresRestart": true // default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly:

  • requiresRestart: true = the client takes the form value and restarts the process with it?
  • requiresRestart: false = the agent takes the form value and uses it somehow? In which case this doesn't really feel like an "env var" but rather just an elicitation request for some text value the agent wants?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was that requiresRestart=true means that you need to restart the agent AFTER you call auth method. F.i. for auggie you need to restart an agent after terminal auth.
Now I better understand how current env var auth works, and I'm not sure if requiresRestart should be true for it 🤔 if you already have API_KEY var set up, then you can just call the auth method without user entering anything and probably without restart. And if API_KEY is not set, this can be detected by client independent from the agent and naturally after user enters api key the client will have to restart the process with this env var and after that call the auth method. Does it make sense?


4. Terminal auth

What's now done with `terminal-auth` capability and \_meta field, for backward compatibility clients might want to read properties from both \_meta and the object itself.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think reading from _meta for now is just for clients who use it now. I don't think we want it in the spec

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to pull that behaviour from _meta to the top layer

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I just wonder if we need it in the spec that they should check the _meta field is all

"id": "123",
"name": "Run in terminal",
"type": "terminal",
"command": "/path/to/command",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In zed, this is relative to the folder in which we extracted the agent... Basically it is an interesting question of what does this command refer to, and it should somehow be scoped to the agent itself likely

i.e. Should we assume the command is the agent itself with different args?
The issue is, the agent might be invoked via node or some other command, in which case parsing the args gets difficult... But it feels strange to allow the agent to use a different command than itself (a current problem we have with Zed's test implementation)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you suggest to pass a cmd option or a parameter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure... basically there is the issue that the agent can't express an absolute path because it doesn't know where it has been downloaded or what other programs might be available (though I think they shouldn't be allowed to)

So what is the right way (possibly just through documentation) of how an agent can express a command (hopefully itself) to be run with different args

but if it is run via node /some/path/index.js then it needs to supply some args as well. But they may not be the exact same as the acp mode because it likely needs a tui

We have a bunch of caveats for all of this in zed extensions using it right now... I just wonder if there is a better way to express this that is portable (i.e. we can tell people we will make sure node is available, but not every client can, etc)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that passing a full path to the binary is okay, dancing with node/other runtime paths is error prone, I'd prefer to avoid it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But how would the agent know the path? Would it be relative to unpacking the binary?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants