As stated above, I need to efficiently count the number of distinct triplets of the form (a, b, b). A triplet is valid if and only if it can be formed by deleting some integers from the array, leaving behind only that triplet in that specific order. In other words, I believe the triplet's elements must appear in the array in that order, but they don't have to be consecutive. The solution needs to be really efficient, as N (the length of the array) can go up to 10^6 (a million). For example, if the array was [5, 6, 7, 3, 3, 3], then the answer would be 3, as the triplets would be (5, 3, 3), (6, 3, 3), and (7, 3, 3).
This was my first brute force (just to start off, O(n^3)):
    n = int(input())
    arr = list(map(int, input().split()))
    ans = set()
    for i in range(n):
        for j in range(i + 1, n):
            if arr[i] != arr[j]:
                for k in range(j + 1, n):
                    if arr[j] == arr[k]:
                        ans.add((arr[i], arr[j], arr[k]))
    print(len(ans))
Then I tried, unsuccessfully, to optimize this to O(n^2), which is still too slow, but I can't even seem to get that right:
    from collections import Counter

    def solve():
        n = int(input())
        arr = list(map(int, input().split()))
        freq = Counter(arr)
        ans = set()
        for a in freq:
            if freq[a] < 1:
                continue
            for b in freq:
                if b != a and freq[b] >= 2:
                    ans.add((a, b, b))
        return len(ans)

    print(solve())
I can't fix the logic of the O(n^2) version or optimize it further to fully solve the problem under the given constraints. Assistance would be much appreciated.
My idea is to break it down into three tasks:
- For each value that occurs at least twice, find its last two occurrences.
- Count how many distinct values exist to the left of the second-to-last occurrence.
- Check whether the duplicated value from the first task is itself among those distinct values, so that it can be excluded.
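Here is a two-pass solution with O(n) memory consumption. The worst-case time complexity is O(n^2), but only because a single dictionary access can degrade to O(n) under heavy hash collisions. The work is O(n) dictionary accesses, each O(1) on average, so it should be quite efficient in practice.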
    def count_from_right(arr):
        # A mapping that maps each distinct value to its first occurrence index.
        first_occurrence_index_map = {}
        # A list where unique_value_counts[i] is the number of distinct values in arr[:i].
        unique_value_counts = [0]
        for i, value in enumerate(arr):
            if value not in first_occurrence_index_map:
                first_occurrence_index_map[value] = i
            unique_value_counts.append(len(first_occurrence_index_map))

        # A mapping where ans_counts[k] is the number of triples of the form (*, k, k).
        ans_counts = {}
        for i in range(len(arr))[::-1]:  # Iterate from right to left.
            if arr[i] not in ans_counts:
                # If this is the last occurrence of arr[i], mark it with 0.
                ans_counts[arr[i]] = 0
            elif ans_counts[arr[i]] == 0:
                # Here, arr[i] is the second-to-last occurrence of this value.
                # The number of possible triplets is the count of unique values in arr[:i], excluding arr[i] itself.
                ans_counts[arr[i]] = unique_value_counts[i] - int(first_occurrence_index_map[arr[i]] < i)
        return sum(ans_counts.values())
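The three tasks above can be cross-checked against a brute-force reference on small inputs. The following is only a sketch: it is a compact restatement of the two-pass idea, not the exact code above, and the names `two_pass` and `brute_force` are hypothetical helpers introduced for illustration.

```python
import random
from itertools import combinations


def brute_force(arr):
    """Reference count of distinct (a, b, b) subsequences with a != b."""
    # combinations() keeps the original left-to-right order of positions.
    return len({(a, b, c) for a, b, c in combinations(arr, 3) if a != b and b == c})


def two_pass(arr):
    """Compact restatement of the two-pass idea (hypothetical helper)."""
    # Pass 1: distinct_before[i] is the number of distinct values in arr[:i].
    seen = set()
    distinct_before = []
    first_index = {}
    for i, value in enumerate(arr):
        distinct_before.append(len(seen))
        seen.add(value)
        first_index.setdefault(value, i)

    # Pass 2: walk right to left; act at each value's second-to-last occurrence.
    seen_from_right = {}
    total = 0
    for i in range(len(arr) - 1, -1, -1):
        b = arr[i]
        seen_from_right[b] = seen_from_right.get(b, 0) + 1
        if seen_from_right[b] == 2:
            # Do not count b itself if it already appeared before index i.
            total += distinct_before[i] - (first_index[b] < i)
    return total


print(two_pass([5, 6, 7, 3, 3, 3]))  # the question's example

# Compare against the brute force on small random arrays.
random.seed(0)
for _ in range(200):
    arr = random.choices(range(6), k=12)
    assert two_pass(arr) == brute_force(arr)
```

Small value ranges such as `range(6)` are chosen here so that duplicates are frequent and the edge cases (values occurring three or more times, a value preceding itself) get exercised.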
At the second-to-last occurrence of each b-value, add the number of different values that came before it. Takes about 1.5 seconds for array length 10^6.
    from collections import Counter

    def linear(arr):
        ctr = Counter(arr)
        A = set()
        result = 0
        for b in arr:
            if ctr[b] == 2:
                result += len(A) - (b in A)
            ctr[b] -= 1
            A.add(b)
        return result
Testing your small example and five larger arrays:
    import random
    from time import time

    def original(arr):
        n = len(arr)
        ans = set()
        for i in range(n):
            for j in range(i + 1, n):
                if arr[i] != arr[j]:
                    for k in range(j + 1, n):
                        if arr[j] == arr[k]:
                            ans.add((arr[i], arr[j], arr[k]))
        return len(ans)

    def test(arr):
        expect = original(arr)
        result = linear(arr)
        print(result == expect, expect, result)

    # Correctness
    test([5, 6, 7, 3, 3, 3])
    for _ in range(5):
        test(random.choices(range(100), k=100))

    # Speed
    n = 10**6
    arr = random.choices(random.sample(range(10**9), n), k=n)
    t = time()
    print(linear(arr))
    print(time() - t)
Sample output (Attempt This Online!):
True 3 3
True 732 732
True 1038 1038
True 629 629
True 754 754
True 782 782
80414828386
1.4968228340148926
Do you want to check only for contiguous triplets of the form (a, b, b)? Based on your example, that doesn't seem to be the case. Then, to get the number of triplets, you can just count the number of unique values and repeated values and multiply them.
In the example [5,6,7,3,3,3],
count_of_unique_vals = 3 # 5, 6 and 7
count_of_repeated_vals = 1 # 3
count_of_triplets = 3 # 3*1, or len(unique_vals) * len(repeated_vals)
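As a rough sketch of how the multiplication idea could be tried out, the snippet below uses one possible reading of it: "unique" values are those occurring exactly once and "repeated" values are those occurring at least twice. That reading, and the names `multiply_estimate` and `brute_force`, are assumptions made for illustration. The brute force is included because the product does not look at element order, so the two counts can be compared on other inputs.

```python
from collections import Counter
from itertools import combinations


def multiply_estimate(arr):
    """Order-insensitive product: (values seen once) * (values seen 2+ times)."""
    freq = Counter(arr)
    singles = sum(1 for count in freq.values() if count == 1)
    repeated = sum(1 for count in freq.values() if count >= 2)
    return singles * repeated


def brute_force(arr):
    """Distinct (a, b, b) subsequences with a != b, respecting array order."""
    return len({(a, b, c) for a, b, c in combinations(arr, 3) if a != b and b == c})


for arr in ([5, 6, 7, 3, 3, 3], [3, 3, 5]):
    print(arr, multiply_estimate(arr), brute_force(arr))
```

On the question's example both functions give 3. On [3, 3, 5] the product gives 1 while the order-respecting count is 0, since 5 never appears before a pair of 3s.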
O(n log n) time. IDK if O(n) is possible. You can use a radix sort to do the sort in O(n) but then the rest of the algorithm is not easy to do in O(n) (impossible?). O(n log n) is good already. – Jérôme Richard Commented Jan 25 at 20:30
a can't be b, so if the array was [5, 6, 7, 3, 3, 3], then the answer would be 4, not 3, as the triplets would be (5,3,3), (6,3,3), (7,3,3), and (3,3,3). – Mike 'Pomax' Kamermans Commented Jan 25 at 22:16