python - How to loop through all distinct triplets of an array such that they are of the format (a, b, b)? Length of array &

As stated above, I need to efficiently count the number of distinct triplets of the form (a, b, b). A triplet is valid if and only if it can be formed by deleting some integers from the array, leaving behind only that triplet in that order. In other words, I believe the triplet's elements must appear in array order but don't have to be consecutive. The solution needs to be really efficient, as N (the length of the array) can go up to 10^6 (a million). For example, if the array is [5, 6, 7, 3, 3, 3], the answer is 3, since the triplets are (5, 3, 3), (6, 3, 3), and (7, 3, 3).

This was my first brute force (just to start off, O(n^3)):

n = int(input())
arr = list(map(int, input().split()))

ans = set()
for i in range(n):
    for j in range(i + 1, n):
        if arr[i] != arr[j]:
            for k in range(j + 1, n):
                if arr[j] == arr[k]:
                    ans.add((arr[i], arr[j], arr[k]))

print(len(ans))

Then I tried, unsuccessfully, to optimize this to O(n^2), which is still too slow, and I can't even get that right:

from collections import Counter

def solve():
    n = int(input())
    arr = list(map(int, input().split()))

    freq = Counter(arr)
    ans = set()
    for a in freq:
        if freq[a] < 1:
            continue
        for b in freq:
            if b != a and freq[b] >= 2:
                ans.add((a, b, b))

    return len(ans)


print(solve())

I can't fix the logic of the O(n^2) attempt or optimize it further to solve the problem under the given constraints. Assistance would be much appreciated.

Asked Jan 25 at 18:13 by vijaysrinivasan Thirumalai; edited Jan 25 at 22:15 by no comment.
  • Count the number of occurrences of each number. Split the numbers into two groups: those with >= 2 occurrences and those with 1. Only numbers in the first group can be at the end. For the chronological order, you can sort the values in each group. You need to count items, not iterate over all of them, so this reduces the number of operations a bit. In the end, it should be doable in O(n log n) time. I don't know if O(n) is possible. You can use a radix sort to do the sort in O(n), but then the rest of the algorithm is not easy to do in O(n) (impossible?). O(n log n) is good already. – Jérôme Richard Commented Jan 25 at 20:30
  • I don't see a restriction that says a can't be b, so if the array was [5, 6, 7, 3, 3, 3], then the answer would be 4, not 3, as the triplets would be (5,3,3), (6,3,3), (7,3,3), and (3,3,3). – Mike 'Pomax' Kamermans Commented Jan 25 at 22:16
  • @Mike'Pomax'Kamermans I first wondered about that, too, but then that example and especially their solution made it clear that they must differ. – no comment Commented Jan 25 at 22:27
  • Then it'll be a good idea for @vijaysrinivasanThirumalai to update their post. – Mike 'Pomax' Kamermans Commented Jan 25 at 23:44

3 Answers

My idea is to break it down into three tasks:

  1. Find the last two occurrences of each duplicated value.
  2. Count the distinct values to the left of the second-to-last occurrence.
  3. Check whether the value found in step 1 is itself among those distinct values, and exclude it if so.

Here is a two-pass solution with O(n) memory consumption. The worst-case time complexity is O(n^2), but since that comes from O(n) dictionary accesses, each of which is O(1) on average, it should be quite efficient in practice.

def count_from_right(arr):
    # A mapping that maps each distinct value to its first occurrence index.
    first_occurrence_index_map = {}

    # A mapping where unique_value_counts[i] represents the number of distinct values in arr[:i].
    unique_value_counts = [0]

    for i, value in enumerate(arr):
        if value not in first_occurrence_index_map:
            first_occurrence_index_map[value] = i
        unique_value_counts.append(len(first_occurrence_index_map))

    # A mapping where ans_counts[k] represents the number of triples of the form (*, k, k).
    ans_counts = {}

    for i in reversed(range(len(arr))):  # Iterate from right to left.
        if arr[i] not in ans_counts:
            # If this is the last occurrence of arr[i], mark it with 0.
            ans_counts[arr[i]] = 0
        elif ans_counts[arr[i]] == 0:
            # Here, arr[i] is the second-to-last occurrence of this value.
            # The number of possible triplets is the count of unique values in arr[:i], excluding arr[i] itself.
            ans_counts[arr[i]] = unique_value_counts[i] - int(first_occurrence_index_map[arr[i]] < i)

    return sum(ans_counts.values())
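
To make the role of `unique_value_counts` concrete, here is a minimal sketch that rebuilds only that prefix table for the example from the question. The helper name `prefix_distinct_counts` is hypothetical and not part of the answer's code; it is an illustration under that assumption, not a replacement for `count_from_right`.

```python
def prefix_distinct_counts(arr):
    """Return counts where counts[i] is the number of distinct values in arr[:i].

    Hypothetical helper, mirroring how unique_value_counts is built above.
    """
    seen = set()
    counts = [0]  # arr[:0] is empty, so it holds 0 distinct values
    for value in arr:
        seen.add(value)
        counts.append(len(seen))
    return counts


arr = [5, 6, 7, 3, 3, 3]
counts = prefix_distinct_counts(arr)
print(counts)  # [0, 1, 2, 3, 4, 4, 4]

# The second-to-last 3 sits at index 4. arr[:4] holds 4 distinct values,
# one of which is 3 itself, so 4 - 1 = 3 triplets end in (3, 3).
i = 4
print(counts[i] - int(arr[i] in arr[:i]))  # 3
```

The subtraction in the last line plays the same role as the `first_occurrence_index_map[arr[i]] < i` term in the answer: it removes b itself from the candidates for a.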

At the second-to-last occurrence of each b-value, add the number of distinct values other than b that came before it. This takes about 1.5 seconds for an array of length 10^6.

from collections import Counter

def linear(arr):
    ctr = Counter(arr)
    A = set()
    result = 0
    for b in arr:
        if ctr[b] == 2:
            result += len(A) - (b in A)
        ctr[b] -= 1
        A.add(b)
    return result
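
To see where each contribution comes from, here is a sketch of the same loop that also records every step. The function `linear_trace` and its variable names are hypothetical additions for illustration; the counting logic is meant to match `linear` above.

```python
from collections import Counter


def linear_trace(arr):
    """Same loop as linear, but record each contribution.

    Returns (total, steps), where steps lists (index, b, added) for every
    position that is the second-to-last occurrence of its value.
    """
    remaining = Counter(arr)  # occurrences of each value at or after the current index
    seen = set()              # distinct values strictly to the left
    total = 0
    steps = []
    for i, b in enumerate(arr):
        if remaining[b] == 2:
            # Every distinct earlier value except b itself can serve as a.
            added = len(seen) - (b in seen)
            steps.append((i, b, added))
            total += added
        remaining[b] -= 1
        seen.add(b)
    return total, steps


print(linear_trace([5, 6, 7, 3, 3, 3]))  # (3, [(4, 3, 3)])
print(linear_trace([3, 3, 4, 4]))        # (1, [(0, 3, 0), (2, 4, 1)])
```

In the second call, the pair of 3s has nothing to its left and adds 0, while the pair of 4s has the single distinct value 3 before it and adds 1, giving the triplet (3, 4, 4).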

Testing your small example and five larger arrays:

import random
from time import time

def original(arr):
    n = len(arr)
    ans = set()
    for i in range(n):
        for j in range(i + 1, n):
            if arr[i] != arr[j]:
                for k in range(j + 1, n):
                    if arr[j] == arr[k]:
                        ans.add((arr[i], arr[j], arr[k]))
    return len(ans)

def test(arr):
    expect = original(arr)
    result = linear(arr)
    print(result == expect, expect, result)

# Correctness
test([5, 6, 7, 3, 3, 3])
for _ in range(5):
    test(random.choices(range(100), k=100))

# Speed
n = 10**6
arr = random.choices(random.sample(range(10**9), n), k=n)
t = time()
print(linear(arr))
print(time() - t)

Sample output (Attempt This Online!):

True 3 3
True 732 732
True 1038 1038
True 629 629
True 754 754
True 782 782
80414828386
1.4968228340148926

Do you want to check only for contiguous triplets of the form (a, b, b)? Based on your example, that doesn't seem to be the case. In that case, to get the number of triplets, you can just count the unique values and the repeated values and multiply the two counts.

In the example [5,6,7,3,3,3],

count_of_unique_vals = 3     # 5, 6 and 7
count_of_repeated_vals = 1   # 3
count_of_triplets = 3        # 3*1, or len(unique_vals) * len(repeated_vals)
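
A minimal sketch of this counting rule, assuming "unique" means values that occur exactly once and "repeated" means values that occur at least twice. The function name `count_by_product` is hypothetical. Note that the rule does not look at element order, so it is worth comparing against the brute force on inputs other than the example.

```python
from collections import Counter


def count_by_product(arr):
    """Hypothetical helper: (#values seen once) * (#values seen 2+ times)."""
    freq = Counter(arr)
    unique_vals = [v for v, c in freq.items() if c == 1]
    repeated_vals = [v for v, c in freq.items() if c >= 2]
    return len(unique_vals) * len(repeated_vals)


print(count_by_product([5, 6, 7, 3, 3, 3]))  # 3, matches the example
print(count_by_product([3, 3, 5]))           # 1, although the brute force finds 0
```

In [3, 3, 5] the only repeated value is 3 and nothing precedes it, so no valid (a, 3, 3) exists, yet the product still counts (5, 3, 3).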
