python - How to convert nested dictionary to specific csv format for use in outside program? (visualization of Markov Network)

I'm having trouble getting a nested dictionary (ngrams) to output in the specific format needed by a different program (Cytoscape, if anybody is curious lol). I need to preserve the values of the dictionary because I'm using a hidden Markov model as the basis for a small language model (SLM) that generates bad poetry, and I need to show the network that the SLM is moving through, for reasons.

My brain is too dead at the moment to properly explain a Markov model/Markov chain, but essentially it looks at the order of a system of 'randomly' occurring events and gives the probability of one specific event following another.
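To make the "probability of events happening consecutively" idea concrete, here is a minimal sketch that turns follow-up counts into transition probabilities. It uses the counts from the ngrams dictionary in this question; the helper name transition_probabilities is made up for illustration and is not part of the asker's code.

```python
# Follow-up counts in the same shape as the ngrams dictionary in the question.
ngrams = {
    "a": {"b": 1, "c": 2, "d": 1},
    "b": {"a": 3},
    "c": {"b": 2, "a": 2},
    "d": {"d": 1, "b": 1},
}


def transition_probabilities(counts):
    """Hypothetical helper: divide each count by its row total."""
    probs = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        probs[source] = {target: n / total for target, n in targets.items()}
    return probs


probs = transition_probabilities(ngrams)
print(probs["a"])  # {'b': 0.25, 'c': 0.5, 'd': 0.25}
```

Each inner dictionary now sums to 1, so probs["a"]["c"] reads as "the chance that c follows a".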

Currently print(ngrams) outputs as follows:

{'a': {'b': 1, 'c': 2, 'd': 1}, 'b': {'a': 3}, 'c': {'b': 2, 'a': 2}, 'd': {'d': 1, 'b': 1}}

The outer keys of the nested dictionary are the initial words (source nodes), the inner keys are the words that follow the source node (target nodes), and the values are the number of times the interaction occurs (the number of duplicate rows). In the final csv, all interactionTypes will be <interactsWith>.
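For readers who want to see where a dictionary of this shape can come from, here is a rough sketch that counts which word follows which in a sequence. The function name count_bigrams and the sample word list are assumptions for illustration only; the question does not show how ngrams was actually built.

```python
from collections import defaultdict


def count_bigrams(words):
    """Hypothetical helper: count how often each word is followed by each other word."""
    counts = defaultdict(lambda: defaultdict(int))
    # Pair every word with its successor: (w0, w1), (w1, w2), ...
    for source, target in zip(words, words[1:]):
        counts[source][target] += 1
    # Convert to plain dicts so the result prints like the ngrams example.
    return {source: dict(targets) for source, targets in counts.items()}


words = ["a", "c", "a", "b", "a", "c"]  # made-up sample sequence
bigrams = count_bigrams(words)
print(bigrams)  # {'a': {'c': 2, 'b': 1}, 'c': {'a': 1}, 'b': {'a': 1}}
```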

The content of the csv file must be as follows to be read by my network mapping program:

sourceNode,interactionType,targetNode
a,<interactsWith>,b
a,<interactsWith>,c
a,<interactsWith>,c
a,<interactsWith>,d
b,<interactsWith>,a
b,<interactsWith>,a
b,<interactsWith>,a
c,<interactsWith>,b
c,<interactsWith>,b
c,<interactsWith>,a
c,<interactsWith>,a
d,<interactsWith>,d
d,<interactsWith>,b

I first tried to output each of the source nodes as a list (i.e. the names of the sub-dictionaries as a list). Expected output: ['a', 'b', 'c', 'd']

wowChart = {
    'a' : {
        'b' : 1,
        'c' : 2,
        'd' : 1
    },
    'b' : {
        'a' : 3 
    },
    'c' : {
        'b' : 2,
        'a' : 2
    },
    'd' : {
        'd' : 1,
        'b' : 2
    }

}

sourceNode = list(wowChart.items())
print(sourceNode)

actual output:

[('a', {'b': 1, 'c': 2, 'd': 1}), ('b', {'a': 3}), ('c', {'b': 2, 'a': 2}), ('d', {'d': 1, 'b': 2})]

I kind of knew this wouldn't work; the next step is for x, obj in wowChart.items(): and go from there.
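For what it's worth, here is a small sketch of the difference: items() yields (key, value) pairs, while iterating the dictionary itself (or calling keys()) yields only the keys. The variable names source_nodes and edges are made up for illustration.

```python
wowChart = {
    "a": {"b": 1, "c": 2, "d": 1},
    "b": {"a": 3},
    "c": {"b": 2, "a": 2},
    "d": {"d": 1, "b": 2},
}

# list(dict) gives only the outer keys, which is the list that was expected.
source_nodes = list(wowChart)
print(source_nodes)  # ['a', 'b', 'c', 'd']

# items() is the right tool once both the key and the sub-dictionary are needed.
edges = []
for source, targets in wowChart.items():
    for target, count in targets.items():
        edges.append((source, target, count))

print(edges[:3])  # [('a', 'b', 1), ('a', 'c', 2), ('a', 'd', 1)]
```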

Not sure why I'm first trying to make lists of names, keys and values to then recombine them into the desired output. There are almost definitely better ways to do this, but my mind is not working. Probably a skill issue tbh; I'm new to programming and haven't used pandas yet.

asked Mar 17 at 22:17 by ranchh

2 Answers

My first instinct would be to use nested loops, which Python handles well. I don't use enumerate() in the code below because I don't believe you are interested in the indices of your source and target nodes.

import csv

ngrams = {
    "a": {"b": 1, "c": 2, "d": 1},
    "b": {"a": 3},
    "c": {"b": 2, "a": 2},
    "d": {"d": 1, "b": 1},
}

with open("my_csv.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["sourceNode", "interactionType", "targetNode"])
    # Iterate over sourceNode keys and interactions
    for key, di in ngrams.items():
        # Iterate over targetNode keys and interaction numbers
        for kj, vj in di.items():
            # Repeat the row once per recorded occurrence
            for _ in range(vj):
                # Write the row in sourceNode, interactionType, targetNode order
                writer.writerow([key, "<interactsWith>", kj])

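This assumes you are using Python 3.x. In Python 2.x, items() still works, although iteritems() avoids building intermediate lists. You would also need to open the file with open("my_csv.csv", "wb") there, because the built-in open() in Python 2 does not accept the newline argument.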
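If you want to check the output before writing a real file, one option is to split the row generation from the writing and send the CSV to an in-memory buffer. This is only a sketch: the generator name edge_rows and its interaction parameter are made up for illustration, and lineterminator="\n" is set just to keep the in-memory text easy to compare.

```python
import csv
import io


def edge_rows(ngrams, interaction="<interactsWith>"):
    """Hypothetical helper: yield one [source, interaction, target] row per occurrence."""
    for source, targets in ngrams.items():
        for target, count in targets.items():
            for _ in range(count):
                yield [source, interaction, target]


ngrams = {
    "a": {"b": 1, "c": 2, "d": 1},
    "b": {"a": 3},
    "c": {"b": 2, "a": 2},
    "d": {"d": 1, "b": 1},
}

# Write to an in-memory buffer instead of a file on disk.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["sourceNode", "interactionType", "targetNode"])
writer.writerows(edge_rows(ngrams))

csv_text = buf.getvalue()
print(csv_text)
```

The same edge_rows(ngrams) call can be passed to writer.writerows() on a real file opened with newline="".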

Here is how I approach it

Code:

import csv

ngrams = {
    "a": {"b": 1, "c": 2, "d": 1},
    "b": {"a": 3},
    "c": {"b": 2, "a": 2},
    "d": {"d": 1, "b": 1},
}

# newline="" stops the csv module from producing blank lines between rows on Windows
with open("/tmp/out.csv", "w", newline="") as buf:
    writer = csv.writer(buf)
    writer.writerow(["sourceNode", "interactionType", "targetNode"])

    for source, targets in ngrams.items():
        for target, count in targets.items():
            rows = [[source, "<interactsWith>", target]] * count
            writer.writerows(rows)

Notes

The algorithm is simple and straightforward
The line rows = [[ ... ]] * count creates a list of count references to the same row, which is fine here because the rows are only written, never modified
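One thing to keep in mind with the * count trick is that list multiplication repeats references rather than copying the inner list. A quick sketch of the difference (the variable names are just for illustration):

```python
row = ["a", "<interactsWith>", "c"]

# List multiplication repeats the reference, not the contents.
shared = [row] * 2
same_object = shared[0] is shared[1]

# Changing one "copy" therefore changes every entry.
shared[0][2] = "x"
shared_after_edit = [r[2] for r in shared]

# A comprehension builds independent lists when rows must be edited later.
independent = [["a", "<interactsWith>", "c"] for _ in range(2)]
independent[0][2] = "x"
independent_after_edit = [r[2] for r in independent]

print(same_object)             # True
print(shared_after_edit)       # ['x', 'x']
print(independent_after_edit)  # ['x', 'c']
```

Since the answer's code passes the rows straight to writer.writerows() without changing them, the shared references cause no problem there.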
