I have a list r whose elements are themselves lists.
Code:
r = [["Test", "A", "B01", 828288, 1, 7, 'C', 5],
["Test", "A", "B01", 828288, 1, 7, 'T', 6],
["Test", "A", "B01", 171878, 3, 8, 'C', 5],
["Test", "A", "B01", 171878, 3, 8, 'T', 6],
["Test", "A", "B01", 871963, 3, 9, 'A', 5],
["Test", "A", "B01", 871963, 3, 9, 'G', 6],
["Test", "A", "B01", 1932523, 1, 10, 'T', 4],
["Test", "A", "B01", 1932523, 1, 10, 'A', 5],
["Test", "A", "B01", 1932523, 1, 10, 'X', 6],
["Test", "A", "B01", 667214, 1, 14, 'T', 4],
["Test", "A", "B01", 667214, 1, 14, 'G', 5],
["Test", "A", "B01", 667214, 1, 14, 'G', 6]]
Each list i in r has leading elements (i[0] to i[5]) that can be identical to those of other lists. If a combination of i[0] to i[5] has already been seen, only i[6] and i[7] should be appended to chr. As a result I am trying to get a dict/JSON like the one shown below:
Code:
{
"name":"A",
"pos":828288,
"s_type":1,
"sub_name":"B01",
"type":"Test",
"x_type":7,
"chr":[
{
"letter":"C",
"no":5
},
{
"letter":"T",
"no":6
}
]
}{
"name":"A",
"pos":171878,
"s_type":3,
"sub_name":"B01",
"type":"Test",
"x_type":8,
"chr":[
{
"letter":"C",
"no":5
},
{
"letter":"T",
"no":6
}
]
}{
"name":"A",
"pos":871963,
"s_type":3,
"sub_name":"B01",
"type":"Test",
"x_type":9,
"chr":[
{
"letter":"A",
"no":5
},
{
"letter":"G",
"no":6
}
]
}{
"name":"A",
"pos":1932523,
"s_type":1,
"sub_name":"B01",
"type":"Test",
"x_type":10,
"chr":[
{
"letter":"T",
"no":4
},
{
"letter":"A",
"no":5
},
{
"letter":"X",
"no":6
}
]
}{
"name":"A",
"pos":667214,
"s_type":1,
"sub_name":"B01",
"type":"Test",
"x_type":14,
"chr":[
{
"letter":"T",
"no":4
},
{
"letter":"G",
"no":5
},
{
"letter":"G",
"no":6
}
]
}
Code:
from collections import defaultdict
from pprint import pprint
if __name__ == "__main__":
    r = [["Test", "A", "B01", 828288, 1, 7, 'C', 5],
         ["Test", "A", "B01", 828288, 1, 7, 'T', 6],
         ["Test", "A", "B01", 171878, 3, 8, 'C', 5],
         ["Test", "A", "B01", 171878, 3, 8, 'T', 6],
         ["Test", "A", "B01", 871963, 3, 9, 'A', 5],
         ["Test", "A", "B01", 871963, 3, 9, 'G', 6],
         ["Test", "A", "B01", 1932523, 1, 10, 'T', 4],
         ["Test", "A", "B01", 1932523, 1, 10, 'A', 5],
         ["Test", "A", "B01", 1932523, 1, 10, 'X', 6],
         ["Test", "A", "B01", 667214, 1, 14, 'T', 4],
         ["Test", "A", "B01", 667214, 1, 14, 'G', 5],
         ["Test", "A", "B01", 667214, 1, 14, 'G', 6]]
    # One nesting level per key field i[0] .. i[6].
    s = defaultdict(lambda:
            defaultdict(lambda:
                defaultdict(lambda:
                    defaultdict(lambda:
                        defaultdict(lambda:
                            defaultdict(lambda:
                                defaultdict(defaultdict)))))))
    for i in r:
        # Walk down one level per field and store i[7] at the innermost level.
        s[i[0]][i[1]][i[2]][i[3]][i[4]][i[5]][i[6]] = i[7]
    pprint(s)
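For comparison, the grouping described above (same i[0] to i[5], append i[6] and i[7] to chr) might be sketched with a single flat dict keyed on a tuple instead of nested defaultdicts. This is only a sketch under assumptions: the helper name group_rows and the constant KEY_FIELDS are made up for illustration, and the mapping of positions to field names is inferred from the desired output shown earlier. The sample list is shortened to three rows.

```python
import json

# Field names for the six key columns, in row order (inferred from the desired output).
KEY_FIELDS = ("type", "name", "sub_name", "pos", "s_type", "x_type")


def group_rows(rows):
    """Group rows by their first six fields and collect the last two under "chr"."""
    groups = {}  # plain dicts keep insertion order (Python 3.7+)
    for row in rows:
        key = tuple(row[:6])  # tuples are hashable, so they can serve as dict keys
        if key not in groups:
            record = dict(zip(KEY_FIELDS, key))
            record["chr"] = []
            groups[key] = record
        groups[key]["chr"].append({"letter": row[6], "no": row[7]})
    return list(groups.values())


# Shortened sample data, taken from the list r above.
r = [["Test", "A", "B01", 828288, 1, 7, 'C', 5],
     ["Test", "A", "B01", 828288, 1, 7, 'T', 6],
     ["Test", "A", "B01", 171878, 3, 8, 'C', 5]]

result = group_rows(r)
print(json.dumps(result, indent=1))
```

Because the key is a tuple of all six fields, rows belonging to the same group do not have to be adjacent in r; the price is that every group stays in memory until the end.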
P.S. Later I want to read the values from a file line by line (77 GB).
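For a file of that size, one option could be to group while reading, so that only one record is held in memory at a time. The sketch below rests on two assumptions that are not stated in the post: the file is comma-separated with eight columns per line, and lines belonging to the same i[0] to i[5] combination are adjacent (as they are in r). The name iter_records is hypothetical, and io.StringIO stands in for the real file object.

```python
import csv
import io
import json
from itertools import groupby

KEY_FIELDS = ("type", "name", "sub_name", "pos", "s_type", "x_type")


def iter_records(lines):
    """Yield one grouped record at a time from an iterable of CSV lines.

    Assumes rows with the same first six fields are adjacent in the input.
    """
    rows = (row for row in csv.reader(lines) if row)  # skip blank lines
    for key, group in groupby(rows, key=lambda row: tuple(row[:6])):
        record = dict(zip(KEY_FIELDS, key))
        # csv delivers strings, so convert the numeric columns back to int.
        for field in ("pos", "s_type", "x_type"):
            record[field] = int(record[field])
        record["chr"] = [{"letter": row[6], "no": int(row[7])} for row in group]
        yield record


# Stand-in for the real file, e.g. open(path, newline="") with an assumed path.
sample = io.StringIO(
    "Test,A,B01,828288,1,7,C,5\n"
    "Test,A,B01,828288,1,7,T,6\n"
    "Test,A,B01,171878,3,8,C,5\n"
)

records = list(iter_records(sample))
for record in records:
    print(json.dumps(record))  # one JSON object per line
```

With the real file, the records would be written out inside the loop instead of being collected in a list. If matching lines are not adjacent, groupby starts a new record each time the key changes, so the file would have to be sorted first or a different approach chosen.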