feat: add an endpoint to manually resume a coder task#21948
Conversation
d85afdb to
f65de1b
Compare
f65de1b to
532de3c
Compare
the endpoint is now functional. however, there are a few non-functional changes to still make. 1. The pause and resume endpoint are nearly identical. It might be good to deduplicate them. 2. The authz tests need to be generalized
|
@mafredri @DanielleMaywood I still plan to make improvements to the testing based on the feedback in #21889, and I’m actively working through the best approach. I don’t want that to block progress on the broader change, though. While I sort out the testing updates, could you please review this PR? Aside from the non-functional testing concerns, it’s otherwise ready for review. |
DanielleMaywood
left a comment
There was a problem hiding this comment.
Change looks fine to me other than a few issues with the tests but they mirror the prior PR so I understand there is a plan to rectify them.
I'm happy with this change but delegating hitting the green button to @mafredri as I don't fully have enough context on the workings of pause/resume
| ctx := testutil.Context(t, testutil.WaitLong) | ||
| client := coderdtest.New(t, &coderdtest.Options{IncludeProvisionerDaemon: true}) | ||
| user := coderdtest.CreateFirstUser(t, client) |
There was a problem hiding this comment.
I realize this is a systemic issue already existing in our tests but we should define ctx just before first use (I'll only leave a comment on this one to save having multiple comments of the same thing).
There was a problem hiding this comment.
It's great that you flagged this Danielle. And for context, premature ctx initialization leads to flakes due to test setup not being accounted for (coderdtest.New, among others).
| workspaceBuild := dbfake.WorkspaceBuild(t, db, database.WorkspaceTable{ | ||
| OrganizationID: user.OrganizationID, | ||
| OwnerID: user.UserID, | ||
| }).Do() |
There was a problem hiding this comment.
Is it possible to use dbfake.TemplateVersion here? Not a blocker just felt weird seeing a WorkspaceBuild on a test called "No Workspace for Task" 😄
| t.Run("Workspace not found", func(t *testing.T) { | ||
| t.Parallel() | ||
|
|
||
| ctx := testutil.Context(t, testutil.WaitShort) | ||
| db, ps := dbtestutil.NewDB(t) | ||
| var workspaceID uuid.UUID | ||
| wrapped := aiTaskStoreWrapper{ | ||
| Store: db, | ||
| getWorkspaceByID: func(ctx context.Context, id uuid.UUID) (database.Workspace, error) { | ||
| if id == workspaceID && id != uuid.Nil { | ||
| return database.Workspace{}, sql.ErrNoRows | ||
| } | ||
| return db.GetWorkspaceByID(ctx, id) | ||
| }, | ||
| } | ||
| client := setupClient(t, wrapped, ps, nil) | ||
| user := coderdtest.CreateFirstUser(t, client) | ||
| task, workspaceIDValue := setupWorkspaceTask(t, db, user) | ||
| workspaceID = workspaceIDValue | ||
|
|
||
| _, err := client.ResumeTask(ctx, codersdk.Me, task.ID) | ||
| var apiErr *codersdk.Error | ||
| require.ErrorAs(t, err, &apiErr) | ||
| require.Equal(t, http.StatusNotFound, apiErr.StatusCode()) | ||
| }) |
There was a problem hiding this comment.
I'm a little confused by this test.
Just re-read the prior PR and saw the discussion around this test style so ignore this
mafredri
left a comment
There was a problem hiding this comment.
Considering the endpoint is fairly simple, works for the common case, and has good test coverage. I'm fine unblocking here.
I'd still like to see tickets tracking the follow-up work (test refactoring, and task state handling). Let me know if you want help creating those.
| codersdk.ProvisionerJobStatus(job.JobStatus), | ||
| codersdk.WorkspaceTransition(latestBuild.Transition), | ||
| ) | ||
| if workspaceStatus == codersdk.WorkspaceStatusRunning { |
There was a problem hiding this comment.
I would suggest that the state handling here needs to be re-thought a bit. I know postWorkspaceBuildsInternal has its own protections against creating a new build, but that's not the only signal we care about for tasks. We also want to help the user resolve state transition issues, whereas workspace running/status does not translate to task and may lead to confusion.
This can be a follow-up task, and some of it is applicable to the pause endpoint too.
Some context:
This error could mask some things about the task state and we're repeating a lot of the work already covered by task.Status. I think we should primarily handle it, and fall back to looking at the workspace job.
If task status is queued, initializing, or active. It means we can't resume (this covers job pending/running/completed and workspace queued/starting/running).
Then there's the gnarly task status error, which can mean any number of things. We could potentially make use of task.StatusDebug field to output more detail about what's wrong. I assume we'd still want workspace running to indicate that it needs to be paused before it can be (re-)resumed.
The unknown status also means something is wrong with the build. I'm unsure of the top of my head if canceling a build results in error or unknown.
Some questions to consider:
- Which states allow resume (and consequently pause) transition?
- How do we convey the resolution path to the user?
- If it's an unhappy path, do we need to suggest the user needs to manually control the workspace?
- Future proofing? Separation between task status and workspace status may become more relevant in the future.
| ctx := testutil.Context(t, testutil.WaitLong) | ||
| client := coderdtest.New(t, &coderdtest.Options{IncludeProvisionerDaemon: true}) | ||
| user := coderdtest.CreateFirstUser(t, client) |
There was a problem hiding this comment.
It's great that you flagged this Danielle. And for context, premature ctx initialization leads to flakes due to test setup not being accounted for (coderdtest.New, among others).
|
@maf I'm tracking follow-up work specifically related to the testing directly in the description of the follow-up PR: #21889. I can write up an issue specifically if you'd like, but I don't consider it necessary to write an issue specifically. This is part of coder/internal#1262 imo. |
Closes coder/internal#1262.
This PR adds:
POST /api/experimental/tasks/{user}/{task}/resumeendpointtask_resume