Skip to content

Conversation

@majiru
Copy link
Contributor

@majiru majiru commented Nov 14, 2025

What changed?

Add a new dynamic config for Auto Enabling fairness and priority if we see the relevant tasks coming in.

Why?

Seamlessly start to transition users who start using the fields over to the new code path.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

Potential risks

Due to storing this in the userdata we are using that interface a bit more, we also need to change the initialization such that we start it before being able to substantiate the defaultQ, this change in initialization might have unintended side effects that I'm not currently seeing.

Notes

I've left comments for pieces of code that I am less sure or unsure about, or where things different slightly from our original plan. Leaving as a draft until I can get some new tests for testing this more thoroughly.

@majiru majiru force-pushed the fairness-and-priority-auto-enable branch from c778149 to 87ae80e Compare November 14, 2025 23:56
method.

Even with the channel providing guardrails on initialization, the
previous attempt would still cause spurious panic's with newPM
being nil by the time we attempt to call the method on it.

I didn't take a look at the generated assembly but I assume that even
with the channel block there was still a race as to when the address of
newPM was copied from the outer scope's stack into the closure's stack.

This could likely be avoided by taking the address of the pointer and
passing that instead, but started to feel too clever to me.
@majiru majiru force-pushed the fairness-and-priority-auto-enable branch from 2bf07fe to b96901b Compare November 17, 2025 19:03
The test code here needs to half initializaton, and then set the
defaultQueue itself. With the setting of config moved over to the actual
initialization we need to replicate some of this code in the test to
keep the same behavior as before.
Copy link
Contributor

@dnr dnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the PR title: it should just talk about auto-enabling priority/fairness. the fact that there's a dynamic config to enable the auto-enabling is the least interesting part.

@majiru majiru changed the title dynamic config for auto enabling priority and fairness auto enabling priority and fairness Nov 20, 2025
Copy link
Contributor

@dnr dnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(trying out new code review tool, apologies if any of the comments end up on the wrong lines)

int64 version = 2;
}

enum FairnessState {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this at the top of the file between imports and the first message

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had it this way originally as well, but the proto linter disagrees with this (a bit silly I know):

- file_path: temporal/server/api/persistence/v1/task_queues.proto
  problems:
  - message: Messages should precede all top-level enums.
    location:
      start_position:
        line_number: 11
        column_number: 1
      end_position:
        line_number: 16
        column_number: 1
    rule_id: core::0191::file-layout
    rule_doc_uri: https://linter.aip.dev/191/file-layout

pm.versionedQueuesLock.Lock()
defer pm.versionedQueuesLock.Unlock()
pm.initCancel()
queue, err := pm.defaultQueueFuture.Get(context.Background())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could block? or is initialize() guaranteed to resolve the future quickly after initCtx is canceled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the later, its a bit tricky, at some point in writing this I had a comment block explaining my thinking about but I must have removed it before pushing. You asking this is probably more evidence that I should put something back there.

My thinking on this is initialization can be in one of three states, it could be before we make the call to wait on the userdatainitialization, currently blocked on initialization, or finishing up the initialization after the wait call returns. In the first two, cancelling the context will short circuit the rest of initialization and not block Stop. However the last one is a bit more complex, there isn't a great spot to short circuit the initialization after the wait call, and we don't want to race with the subscriptions. So my thought was that as long as there is nothing that can block us for long after that first Wait call, then we should just let it finish first before tearing down. The idea being that the rest of initialization should not have any more blocking calls.

I'm not in love with this solution, and I'm open to discussing alternatives, but I think this works "ok" as-is.


ctx, cancel := context.WithCancel(context.Background())
cancel()
defaultQ, err := pm.defaultQueueFuture.Get(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just use defaultQueue()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I avoided it because if the queue is not ready defaultQueue() has that softassert in it, in this case it not being ready yet is more expected and not really an error, so I didn't want to spam the output for it. Perhaps that should be in a comment there? Or maybe defaultQueue takes another flag? If we add a GetIfReady() to our future implementation I think my preference would be to use that and a comment why we dont use defaultQueue()

@awln-temporal awln-temporal self-requested a review December 5, 2025 20:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants