最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

python - How does this algo cuttree hackerrank work? - Stack Overflow

programmeradmin4浏览0评论

Given a tree T with n nodes, how many subtrees (T’) of T have at most K edges connected to (T – T’)? Input Format

The first line contains two integers n and K followed by n-1 lines each containing two integers a & b denoting that there’s an edge between a & b.

Constraints

1 <= K <= n <= 50 Every node is indicated by a distinct number from 1 to n.

Output Format

A single integer which denotes the number of possible subtrees.

def cutTree(n, k, edges):
    g = [[] for _ in range(n)]
    for i in edges:
        g[i[0]-1].append(i[1]-1)
        g[i[1]-1].append(i[0]-1)
    global ans
    ans = 1
    def multiply(x, y):
        ans = defaultdict(lambda: 0)
        for k, v in x.items():
            for k1, v1 in y.items(): ans[k+k1-1] += v*v1
        for k, v in ans.items():
            if k in x: x[k] += v
            else: x[k] = v
    def dfs(i,p):
        global ans
        if g[i] == [p]:
            ans += 1
            return {0:1}
        x = 1 if i else 0
        res = {len(g[i])-x : 1}
        for nxt in g[i]:
            if nxt != p: multiply(res, dfs(nxt, i))
        ans += sum(((v if i+x <= k else 0) for i, v in res.items()))
        return res
    dfs(0,-1)
    return ans

It solves but how?

Given a tree T with n nodes, how many subtrees (T’) of T have at most K edges connected to (T – T’)? Input Format

The first line contains two integers n and K followed by n-1 lines each containing two integers a & b denoting that there’s an edge between a & b.

Constraints

1 <= K <= n <= 50 Every node is indicated by a distinct number from 1 to n.

Output Format

A single integer which denotes the number of possible subtrees.

def cutTree(n, k, edges):
    g = [[] for _ in range(n)]
    for i in edges:
        g[i[0]-1].append(i[1]-1)
        g[i[1]-1].append(i[0]-1)
    global ans
    ans = 1
    def multiply(x, y):
        ans = defaultdict(lambda: 0)
        for k, v in x.items():
            for k1, v1 in y.items(): ans[k+k1-1] += v*v1
        for k, v in ans.items():
            if k in x: x[k] += v
            else: x[k] = v
    def dfs(i,p):
        global ans
        if g[i] == [p]:
            ans += 1
            return {0:1}
        x = 1 if i else 0
        res = {len(g[i])-x : 1}
        for nxt in g[i]:
            if nxt != p: multiply(res, dfs(nxt, i))
        ans += sum(((v if i+x <= k else 0) for i, v in res.items()))
        return res
    dfs(0,-1)
    return ans

It solves https://www.hackerrank/challenges/cuttree/forum but how?

Share Improve this question edited yesterday Ken Y-N 15k21 gold badges77 silver badges121 bronze badges asked yesterday IrinaIrina 1,2972 gold badges16 silver badges38 bronze badges
Add a comment  | 

1 Answer 1

Reset to default 1

Here's an explanation I can give you (if you need clarification do let me know).

Main idea

Since a subtree T′ of T is a connected set of nodes.
Imagine extending such a subtree starting from some node and then deciding, for each adjacent node (child in the DFS tree), whether we are attaching it or not.
When you do not attach a branch, that edge is cut.

Observations

  1. When you attach a subtree from a child, the edge that connects the child to the current node goes from cut to attached.
  2. When combining choices from different branches, you need to subtract 1 for each branch that you attach (because you save a cut).

Dynamic Programming

The code uses dynamic programming to effectively count the number of different branches (yes despite what the Hackerrank forum/discussion member claims, this code is using an underlying dp).

Currently the code creates a dictionary dp table where the keys are the number of cut edges that the subtree (that includes the current node) has, and the values are the number of ways to achieve that.

Note that the "multiply" function combines dp states. For example:
a). Suppose you have a current configuration with k (edgeCount1) boundary edges (from the current node’s partial subtree) and a child subtree that can be attached in ways that yield k1 (edgeCount2) boundary edges. (notice again the observation that we are saving 1 cut)
2. This convolution over the possible numbers of boundary edges may be considered analogous to multiplying two polynomials (where the exponents track the number of boundary edges) with a shift of −1.

What makes this code tricky

One thing that the author did to make the code so hard to understand is they used variables with poor names, and in some cases names that really distracted from the main idea. Because of this I've included an updated code (one that simplifies part of the logic, but also names the variables in a way to make it a bit easier to understand the code):

def cutTree(n, k, edges):
    graph = [[] for _ in range(n)]
    #We are creating an undirected graph from the edges
    for i in edges:
        graph[i[0]-1].append(i[1]-1)
        graph[i[1]-1].append(i[0]-1)
    global ans
    ans = 1
    #previously called multiply
    def combine(dp, dp2):
        counts = Counter()
        for edgeCount1, v in dp.items():
            #k (or edgeCount1) is the edgeCount
            #v is the number of times we can count/achieve that edgeCount
            for edgeCount2, v2 in dp2.items():
                #You'll remember that y (dp2) is also part of dp
                #k1 (edgeCount2) is edgeCount
                #v1 (v2) is the number of times we can count/achieve that edgeCount
                #we multiply to get the combinations
                #again the observation is that we save 1 cut edge when combining
                counts[edgeCount1+edgeCount2-1] += v*v2
        for edgeCount, v in counts.items():
            #add combinations to the edgeCount (k)
            dp[edgeCount]+=v
    def dfs(node,parent):
        global ans
        hasParent = 1 if node else 0
        #x is telling us if we have a parent.
        edgeCount = len(graph[node])-hasParent # Number of edges that aren't the parent.
        dp = Counter()
        dp[edgeCount] = 1 #edges: 1
        for nxt in graph[node]:
            #nxt not being equal to parent makes sure we don't revisit the parent.
            if nxt != parent: 
                combine(dp, dfs(nxt, node))
        #If i (the edge count) + hasParent (which is 1 if there is a parent, and 0 otherwise)
        #In other words if the total edge count is less than or equal to  k (i.e. at most K edges)
        #We are going to add the number of times that combination happens (v), 
        #We discard edge counts that are too large
        ans += sum(((v if i+hasParent <= k else 0) for i, v in dp.items()))
        return dp
    dfs(0,-1) #start with node 0 parent -1 (no parent)
    return ans

My explanation may come up short (as far as simplifying it goes), and I encourage others to post their own answers explaining the code.
Hopefully my example code with the adjusted variable names clarifies a bit of what the code is doing

(though I do want to note I did remove the base case in the dfs as it wasn't necessary... Perhaps that was a mistake as far as the explanation goes (as it might be easier to see the other, or you can try to understand that as well)...

Again I encourage others to post their explanations, feel free to leave a comment if you don't understand part of it.

发布评论

评论列表(0)

  1. 暂无评论