hard | Sliding Window | AI-applied | ~40 min
Count Context Spans With Exactly K Distinct Tokens
A tokenizer emits a stream of token ids. To audit diversity, count how many contiguous spans contain exactly k distinct token ids. This count is a building block for windowed vocabulary metrics.
Problem
Given an integer array tokens and an integer k, count the number of contiguous subarrays that contain exactly k distinct values. Return that count.
Input
An integer array tokens and an integer k.
Output
An integer: the count of subarrays with exactly k distinct values.
Constraints
- 1 <= tokens.length <= 2 * 10^4
- 1 <= tokens[i] <= tokens.length
- 1 <= k <= tokens.length
Examples
Example 1
Input
tokens = [1,2,1,2,3], k = 2
Output
7
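Seven subarrays contain exactly two distinct token ids: [1,2], [2,1], [1,2], [2,3], [1,2,1], [2,1,2], [1,2,1,2]. Subarrays at different positions count separately, so [1,2] appears twice.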
Example 2
Input
tokens = [1,2,1,3,4], k = 3
Output
3
[1,2,1,3], [2,1,3], [1,3,4] each have exactly three distinct ids.
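One possible approach, sketched here as an illustration rather than as the intended reference solution, uses the identity exactly(k) = at_most(k) - at_most(k - 1), with a sliding window computing each at-most count in linear time. The function names count_at_most and count_exactly_k are hypothetical and chosen only for this sketch.

```python
from collections import defaultdict


def count_at_most(tokens, k):
    """Count contiguous spans containing at most k distinct values."""
    if k <= 0:
        return 0
    freq = defaultdict(int)  # occurrences of each value inside the window
    distinct = 0             # number of distinct values inside the window
    left = 0
    total = 0
    for right, tok in enumerate(tokens):
        if freq[tok] == 0:
            distinct += 1
        freq[tok] += 1
        # Shrink from the left until the window holds at most k distinct values.
        while distinct > k:
            dropped = tokens[left]
            freq[dropped] -= 1
            if freq[dropped] == 0:
                distinct -= 1
            left += 1
        # Every span ending at `right` and starting in [left, right] is valid.
        total += right - left + 1
    return total


def count_exactly_k(tokens, k):
    """Spans with exactly k distinct values (illustrative helper name)."""
    return count_at_most(tokens, k) - count_at_most(tokens, k - 1)


print(count_exactly_k([1, 2, 1, 2, 3], 2))
print(count_exactly_k([1, 2, 1, 3, 4], 3))
```

The two calls reproduce Example 1 and Example 2. Each at-most pass moves both window pointers forward only, so the sketch does work linear in tokens.length.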