I need a matrix that is defined as described at https://github.com/d3/d3-chord:
Computes the chord layout for the specified square matrix of size n×n, where the matrix represents the directed flow amongst a network (a complete digraph) of n nodes. The given matrix must be an array of length n, where each element matrix[i] is an array of n numbers, where each matrix[i][j] represents the flow from the ith node in the network to the jth node. Each number matrix[i][j] must be nonnegative, though it can be zero if there is no flow from node i to node j.
Code:
var matrix = [
  [11975,  5871, 8916, 2868],
  [ 1951, 10048, 2060, 6171],
  [ 8010, 16145, 8090, 8045],
  [ 1013,   990,  940, 6907]
];
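To make the quoted definition concrete, here is a small Python sketch that checks the two constraints it states: the matrix is square (n rows of n numbers) and every entry is nonnegative. The helper name `is_valid_chord_matrix` is made up for this illustration and is not part of d3-chord.

```python
def is_valid_chord_matrix(matrix):
    """Return True if matrix is n x n and all entries are nonnegative."""
    n = len(matrix)
    for row in matrix:
        if len(row) != n:
            return False  # not square
        if any(value < 0 for value in row):
            return False  # negative flow is not allowed
    return True


matrix = [
    [11975,  5871, 8916, 2868],
    [ 1951, 10048, 2060, 6171],
    [ 8010, 16145, 8090, 8045],
    [ 1013,   990,  940, 6907],
]

print(is_valid_chord_matrix(matrix))  # True
print(matrix[0][1])                   # 5871, the flow from node 0 to node 1
```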
I have the following input file:
Code:
tig00007144 chr03 1 5 1 5
tig00026480 chr03 10 15 10 15
tig00003221 chr03 7 9 12 14
tig00010111 chr03 9 12 17 20
tig00000318 chr03 15 20 15 20
And the matrix would probably look like this:
The empty matrix cells would have to be filled with zeros.
So far I have written this code:
Code:
from collections import OrderedDict
import json
import numpy


def create_matrix(filename, assembly_len, reference_len):
    # The matrix must be square, so use the larger of the two lengths.
    dimension = max(assembly_len, reference_len)
    matrix = numpy.zeros(shape=(dimension, dimension))
    print(matrix)
    with open(filename) as f:
        for line in f:
            try:
                parts = line.rstrip().split('\t')
                query_name = parts[0]
                subject_name = parts[1]
                query_start = int(parts[2])
                query_end = int(parts[3])
                subject_start = int(parts[4])
                subject_end = int(parts[5])
            except (ValueError, IndexError):
                # Skip lines that are incomplete or have non-numeric coordinates.
                continue
            # Convert 1-based inclusive coordinates to 0-based index ranges.
            query_elements = range(query_start - 1, query_end)
            subject_elements = range(subject_start - 1, subject_end)
            print(list(query_elements))
            print(list(subject_elements))


if __name__ == "__main__":
    assembly_len = 20
    reference_len = 30
    create_matrix('blast_test', assembly_len, reference_len)
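The step that is still missing is writing the ranges into the matrix. One way this might look, as a rough sketch only: it assumes that position k of the query range is paired with position k of the subject range and that each pair adds 1 to matrix[query][subject]. That pairing rule is an assumption, not something d3-chord prescribes. The function name `fill_matrix` is hypothetical, it takes lines instead of a filename so it can be tried without a file, it splits on any whitespace, and it uses plain nested lists rather than numpy so that `json.dumps` can serialize the result directly. Coordinates larger than `dimension` would raise an IndexError.

```python
import json


def fill_matrix(lines, dimension):
    """Build a dimension x dimension zero matrix and count aligned positions.

    Assumption: the k-th query position is paired with the k-th subject
    position, and each pair increments matrix[query][subject] by 1.
    """
    matrix = [[0] * dimension for _ in range(dimension)]
    for line in lines:
        parts = line.split()
        if len(parts) < 6:
            continue  # skip blank or incomplete lines
        try:
            query_start, query_end, subject_start, subject_end = map(int, parts[2:6])
        except ValueError:
            continue  # skip lines whose coordinates are not integers
        # Convert 1-based inclusive coordinates to 0-based index ranges.
        query_elements = range(query_start - 1, query_end)
        subject_elements = range(subject_start - 1, subject_end)
        for q, s in zip(query_elements, subject_elements):
            matrix[q][s] += 1
    return matrix


sample = [
    "tig00007144 chr03 1 5 1 5",
    "tig00003221 chr03 7 9 12 14",
]
result = fill_matrix(sample, 30)
print(result[0][0], result[6][11], result[5][5])
print(json.dumps(result[0][:6]))  # first six cells of row 0 as JSON
```

All cells that no alignment touches stay at zero, which matches the requirement that empty cells be filled with zeros.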
Thanks in advance.