-
Notifications
You must be signed in to change notification settings - Fork 622
Preview Vector Set API support #1346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
This all makes sense. In storage-v2, we have the notion of a LogRecord record format, that already includes optional fields such as ETag and Expiration. Adding a namespace would be analogous to this, with a couple of differences:
TBD: how to handle larger namespace names in the same framework (e.g., "/user/foo/"), and is that useful/necessary to make as a first-class citizen versus users directly incorporating in the actual key. |
…h ideally we never hit this case)
Namespaces can be used for a lot of things, like replacing the current numbered database implementation with something that is supported cluster-wide as in valkey (In that case they end up in the same AOF file I guess?). Also, quite a few RESP-accepting DBs have a namespace implementation of sorts, e.g. kvrocks. I had an idea of combining ACLs and database numbers (IMHO, this would be usually better than redis's prefix ACLs, because it doesn't require client to cooperate), and namespaces would be just as natural here. |
…e recursive locking shenanigans; include element ids in benchmark data
…to vectorApiPoC
…t should look enough like v2 for government work
…ling them just yet
…ng pending operations in VectorManager
| throw new GarnetException($"Could not switch replication replay session to {self.dbId}, replication will fail"); | ||
| } | ||
|
|
||
| self.replicationReplayTasks[i] = Task.Factory.StartNew( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might have missed it, but we need a way to cancel these tasks in the event the replication stream breaks (i.e. failover).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the signal for that? Looking at TryFailover and nothing is jumping out - seems like AofProcessor (and its associated RespServerSession) is kept live as long as the ClusterProvider is.
Since the AOF stream would be interrupted during failover, I'm not sure aborting the task is actually necessary - they'll just stay pending until/unless a new AOF stream is received.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a cancellationToken in ReplicaReplayTask.cs. This was not passed down to ProcessAofRecordInternal because we didn't have any background tasks in the AofProcessor.cs (as far as I know).
So, you may use that.
…artially deleted Vector Set
…s that cleanup of elements happens in all cases
…to vectorApiPoC
…eaned up into a utility type
…ask is stopped, still needs some explicit tests
| } | ||
| else | ||
| { | ||
| resetEvent.Reset(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we should protect resetEvent.Reset() with some kind of lock as well, if we're creating a more generic CountingEventSlim.
Maybe it's a scenario too extreme, but what if (let me know if it makes sense or maybe I'm missing something):
Initial conditions:
- count = 1
- resetEvent is signaled, so it's blocking operations from running
- a background thread is actively decrementing count
Serialized execution:
- [Thread1]: Calls Increment, executes line 38 and increments count to 2, addRes = 2, then it suspends
- [BackgroundThread] Decrements 2 times, getting count to zero, then does the CompareExchange to set the value and calls
resetEvent.Set() - [Thread1] Goes out of suspension, it still sees addRes = 2, so it will enter the
elseblock and will callresetEvent.Reset(). Now even though count == 0,resetEventwill block execution
This problem doesn't happen with the AOF VADD processing logic because we complete CountingEventSlim.Increment before we add work for the BackgroundThreads, so they can't decrement to 0 in the middle of CountingEventSlim.Increment. But if CountingEventSlim is used in the future with something else, maybe a 'deadlock' could happen, right?
Implement a small subset of Vector Set functionality, as a base for future full support.
This PR:
VectorSetbit toRecordInfoto allow distinguishing Vector Sets from other keysVADD,VREM,VEMB, andVSIMXB8- which allows passing byte-values without conversion, joiningFP32(which is used for little-endianfloats)XPREQ8- a quantization option which takes in pre-quantized vectors, meant for use withXB8EnableVectorSetPreview/--enable-vector-set-previewA more complete write up of the design, justifications, and remaining work is in vector-set.md.
There is still a lot of work to be done on Vector Sets, but this PR is at a point where it's functional enough to be played with - and there's merit to merging it so other work (Store V2, multi-threaded replication, etc.) isn't complicated.
The "big" things (besides commands and options) that are missing are:
XPREQ8quantizers - implementation here is on DiskANN