Skip to content

feat: add AgentAPI using DRPC #10811

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Dec 18, 2023
Merged

feat: add AgentAPI using DRPC #10811

merged 12 commits into from
Dec 18, 2023

Conversation

deansheather
Copy link
Member

@deansheather deansheather commented Nov 21, 2023

Creates coderd/workspaceagentapi.go which implements DRPC agentproto.DRPCAgentServer. The server is "instanced" per agent (as it will be created upon receiving a websocket connection from the agent).

TODO:

  • GetManifest
  • GetServiceBanner
  • UpdateStats
  • UpdateLifecycle
  • BatchUpdateAppHealths
  • UpdateStartup
  • BatchUpdateMetadata
  • BatchCreateLogs
  • StreamDERPMaps
  • CoordinateTailnet I'm going to leave this one unimplemented for now and Spike can implement it when it's ready
  • Change the existing HTTP API handlers to convert requests and responses to protobuf and call the RPC directly
    • GetManifest
    • GetServiceBanner
    • UpdateStats
    • UpdateLifecycle
    • BatchUpdateAppHealths
    • UpdateStartup
    • BatchUpdateMetadata
    • BatchCreateLogs
    • StreamDERPMaps
  • Add an agent endpoint for websocket <-> yamux <-> DRPC
  • Add agentsdk method to get an RPC client
  • Copy tests from HTTP API for each supported handler and update them to use protobuf
  • Move AgentAPI to it's own coderd sub-package to avoid import problems

Updates #10530

@deansheather deansheather marked this pull request as ready for review November 21, 2023 10:39
@deansheather
Copy link
Member Author

This PR is getting huge so I'm going to leave it for now and I'll do the rest of the TODOs in follow ups.

}

var sharingLevel codersdk.WorkspaceAppSharingLevel
switch protoApp.SharingLevel {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we could do to future proof here is introduce an AllWorkspaceAppSharingLevels slice in codersdk then walk this slice and match on

strings.ToLower(protoApp.SharingLevel.String())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this with maps instead to avoid having to iterate over all enums every time

func (api *API) workspaceAgentManifest(rw http.ResponseWriter, r *http.Request) {
// @Success 101
// @Router /workspaceagents/me/rpc [get]
func (api *API) workspaceAgentRPC(rw http.ResponseWriter, r *http.Request) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is way to long and hard to follow. It needs to be refactored into at least three pieces:

  1. Authz --- is this the latest build?
  2. Connection tracking --- pings, updating connection times, disconnect
  3. Serving dRPC

return ensureLatestBuild, build, true
}

func (api *API) agentConnectionUpdate(ctx context.Context, workspaceAgent database.WorkspaceAgent, workspaceID uuid.UUID, conn *websocket.Conn) (func(closeCtx context.Context, ensureLatestBuildFn func() error), bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This signature seems needlessly convoluted. Why do we need to return a function to the caller, for the caller to just call it once?


// Ensure the resource is still valid!
// We only accept agents for resources on the latest build.
ensureLatestBuild := func() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making this function a closure makes it harder to understand because you need to pass this anonymous func() error around, out of this function and into the thing that handles the pings.

If you made it a regular function that accepted a database.Store and a database.WorkspaceBuild then it would simplify the return values of ensureLatestBuild and agentConnectionUpdate

// We use a custom heartbeat routine here instead of `httpapi.Heartbeat`
// because we want to log the agent's last ping time.
var lastPing atomic.Pointer[time.Time]
lastPing.Store(ptr.Ref(time.Now())) // Since the agent initiated the request, assume it's alive.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the structure of storing a bunch of state in a top-level function and then creating several closures over that data, which are all inlined. I think it would be easier to understand if there were a struct that stores the data and then methods on the struct.

@github-actions github-actions bot added the stale This issue is like stale bread. label Dec 13, 2023
@spikecurtis spikecurtis removed the stale This issue is like stale bread. label Dec 15, 2023
Copy link
Contributor

@spikecurtis spikecurtis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take over the refactoring of the connection / heartbeat stuff I asked for.

@deansheather deansheather merged commit e464310 into main Dec 18, 2023
@deansheather deansheather deleted the dean/agent-api branch December 18, 2023 12:53
@github-actions github-actions bot locked and limited conversation to collaborators Dec 18, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants