Today I tackled an interesting Leetcode problem: Finding the longest substring without any duplicating characters. This problem introduced me to the dynamic sliding window technique a powerful tool that I can add to my arsenal.
I implemented the solution in which I used a set to track distinct characters and two pointers l and r. When encountering a duplicate, I made the program shrink from the left to the right and removing them with a while loop.
My solution runs in O(n) but it took 6558ms, which is inefficient. There must be a way to optimize it further. Any suggestions?
class Solution(object):
def lengthOfLongestSubstring(self, s):
"""
:type s: str
:rtype: int
"""
longest = set()
l = 0
max_num = 0
# "abcabcbb"
for r in range(len(s)):
# add the s[r] if not in arr
while s[r] in longest:
longest.remove(s[l])
l += 1
if s[r] not in longest:
longest.add(s[r])
max_num = max(max_num, (r-l) + 1)
print(longest)
print(max_num)
continue
return max_num
This problem helped me understand choosing the right data structure and technique to use, and how to expand and shrink a sliding window. Have you encountered this problem before? How would you optimize it?
Today I tackled an interesting Leetcode problem: Finding the longest substring without any duplicating characters. This problem introduced me to the dynamic sliding window technique a powerful tool that I can add to my arsenal.
I implemented the solution in which I used a set to track distinct characters and two pointers l and r. When encountering a duplicate, I made the program shrink from the left to the right and removing them with a while loop.
My solution runs in O(n) but it took 6558ms, which is inefficient. There must be a way to optimize it further. Any suggestions?
class Solution(object):
def lengthOfLongestSubstring(self, s):
"""
:type s: str
:rtype: int
"""
longest = set()
l = 0
max_num = 0
# "abcabcbb"
for r in range(len(s)):
# add the s[r] if not in arr
while s[r] in longest:
longest.remove(s[l])
l += 1
if s[r] not in longest:
longest.add(s[r])
max_num = max(max_num, (r-l) + 1)
print(longest)
print(max_num)
continue
return max_num
This problem helped me understand choosing the right data structure and technique to use, and how to expand and shrink a sliding window. Have you encountered this problem before? How would you optimize it?
Share Improve this question edited 2 days ago ThomasIsCoding 103k9 gold badges37 silver badges102 bronze badges asked Apr 2 at 4:42 Emmanuel RusianaEmmanuel Rusiana 394 bronze badges 3 |4 Answers
Reset to default 5That’s an interesting problem. As mentioned in the comments already, I’d start by removing print statements.
Your intuition using sets is great, but this element lookup in set() inside a while loop, seems to be giving a lot of overhead.
Instead, you could use a heap or a hash map, to store last positions of each character and then look them up as you go further into the array.
You could even use default dict to initialise the char_index.
class Solution(object):
def lengthOfLongestSubstring(self, s):
char_index = {} # hashmap to store last positions of each character
max_length = 0
l = 0
for r, char in enumerate(s):
if char in char_index and char_index[char] >= l:
# Move left pointer just after the duplicate's previous index
l = char_index[char] + 1
# Update or add the character's latest index
char_index[char] = r
# Update maximum length found so far
max_length = max(max_length, r - l + 1)
return max_length
Some improvements:
Don't increase
l
with 1, but instead remember where the current character occurred and setl
to the index right after it. This way you can eliminate the inner loop. This is what also @kris-szczepaniak has pointed out, suggesting to use a dict.As the code challenge limits the character range to "English letters, digits, symbols and spaces", which suggests the characters are within the ASCII range, you could use a fixed-size list instead of a set/dict, where the index into that list corresponds to the ASCII code of the character.
Here is how that could look:
def lengthOfLongestSubstring(self, s: str) -> int:
# List indices are ASCII codes. List values are indices in s,
# after last occurrence of character that has the ASCII code.
indexafter = [0] * 128 # ASCII range
size = 0
start = 0
for stop, asc in enumerate(map(ord, s), 1): # get index *after* current character
if indexafter[asc] > start: # character already occurs within window
start = indexafter[asc] # shrink window: start after previous occurrence
indexafter[asc] = stop # update index to after the current occurrence
size = max(size, stop - start)
return size
As noted in comments, you should remove the print() statements as well as the unnecessary test after the while loop.
Which breaks it down to this:
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
left = result = 0
observed = set()
for right, c in enumerate(s):
while c in observed:
observed.remove(s[left])
left += 1
observed.add(c)
if (r := right - left + 1) > result:
result = r
return result
You can use the sliding window algorithm in this way:
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
# Initialize set st, n as sliding wondow pointer, res to store result
st = set()
n = 0
res = 0
# Iterate through the string s with i as right pointer
for i in range(len(s)):
# Handle repeating character
while s[i] in st:
st.remove(s[n])
n += 1
# Add current character to set
st.add(s[i])
# Update result
res = max(res, i - n + 1)
# Return result
return res
Hope this helps
if s[r] not in longest
? Thewhile
loop before this ensures that that's true. – Barmar Commented Apr 2 at 5:11print
calls? – user2357112 Commented Apr 2 at 5:17