hardBinary SearchAI-applied~48 min
Longest Repeated Span in a Training Corpus
Before fine-tuning, you deduplicate a concatenated text corpus. Find the longest substring that appears at least twice (occurrences may overlap) so you can flag and trim redundant spans.
Problem
Given a string s, find the longest substring that occurs at least twice within s. The two occurrences may overlap. Return any one such longest duplicated substring, or an empty string if no character repeats.
Input
A string s of length [1, 3*10^4], lowercase English letters.
Output
A string: a longest substring appearing at least twice (or "").
Constraints
- 2 <= s.length <= 3 * 10^4
- s consists of lowercase English letters.
Examples
Example 1
Input
s = "banana"
Output
"ana"
'ana' occurs at indices 1 and 3 (overlapping).
Example 2
Input
s = "abcd"
Output
""
No repeated substring.