I have text files from an external source that are formatted like so:

    0 0 -0.105961 0.00000
    1 0 -1.06965 0.00000
    1 1 -0.0187213 -0.240237
    2 0 -0.124695 0.00000
    2 1 -0.178982 0.0633255
    2 2 0.760988 -0.213796
    3 0 -1.96695 0.00000
    3 1 0.0721285 0.0491248
    3 2 -0.560517 0.267733
    3 3 -0.188732 -0.112053
    4 0 -0.0205364 0.00000
    ⋮ ⋮ ⋮ ⋮
    40 30 0.226833 -0.733674
    40 31 0.0444837 -0.249677
    40 32 -0.171559 -0.970601
    40 33 -0.141848 -0.137257
    40 34 -0.247042 -0.902128
    40 35 -0.495114 0.322912
    40 36 0.132215 0.0543294
    40 37 0.125682 0.817945
    40 38 0.181098 0.223309
    40 39 0.702915 0.103991
    40 40 1.11882 -0.488252

where the first two columns are the indices of a 2d array (say `i` and `j`), and the 3rd and 4th columns are the values for two 2d arrays (say `p[:,:]` and `q[:,:]`). What would be the idiomatic way in Python/Julia to read this into two 2d arrays?
There are some assumptions we can make: the arrays are lower triangular (i.e., the values only exist (or are nonzero) for `j <= i`), and the indices are increasing, that is, the last line can be expected to have the largest `i` (and probably `j`).
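To make these two assumptions concrete, here is a small Python sketch that checks them on a sequence of index pairs. The name `check_assumptions` is made up for illustration, and I am reading "increasing" as "`i` never decreases from one line to the next", which is all I actually rely on.

```python
def check_assumptions(pairs):
    """Return True if the (i, j) pairs satisfy both assumptions.

    `pairs` is an iterable of (i, j) integer pairs in file order.
    """
    last_i = -1
    for i, j in pairs:
        if not 0 <= j <= i:
            return False  # not lower triangular (or a negative index)
        if i < last_i:
            return False  # the row index went backwards
        last_i = i
    return True


print(check_assumptions([(0, 0), (1, 0), (1, 1), (2, 0)]))
print(check_assumptions([(0, 0), (0, 1)]))
```

Note that the order of `j` within a row is deliberately not checked here.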
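The current implementation assumes that the maximum `i` and `j` are both 40, and proceeds like so (Julia minimal working example):

    using OffsetArrays                # allows 0-based indices
    using DelimitedFiles: readdlm

    n = 40                            # assumed maximum index
    p = zeros(0:n, 0:n)
    q = zeros(0:n, 0:n)
    open(filename) do infile          # `filename` is the path to the data file
        for i = 0:n
            for j = 0:i               # lower triangle only, j changes fastest
                line = readline(infile)
                arr = readdlm(IOBuffer(line))
                p[i,j] = arr[3]       # columns 1 and 2 (the indices) are never read
                q[i,j] = arr[4]
            end
        end
    end
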
Note that this example also assumes that the index `j` changes the fastest, which is usually true, but in doing so it effectively ignores the first two columns of the data.
However, I'm looking for a solution that makes fewer assumptions about the data file (perhaps only that the arrays are lower triangular and that the indices are increasing). What would be a "natural" way of doing this in Julia or Python?
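For illustration, this is roughly the behaviour I have in mind, as a rough Python sketch using only the standard library. The function name `read_triangular` is hypothetical, the arrays are plain nested lists rather than a proper array type, and the size is taken from the largest index found in the file instead of being hard-coded. The values are placed according to the first two columns, so the order of the lines does not matter.

```python
import io


def read_triangular(lines):
    """Build two (n+1) x (n+1) nested lists from 'i j p q' records.

    `lines` is any iterable of text lines, e.g. an open file.
    """
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        i, j = int(fields[0]), int(fields[1])
        records.append((i, j, float(fields[2]), float(fields[3])))
    if not records:
        return [], []
    # Size comes from the data, not from a hard-coded n.
    n = max(max(i, j) for i, j, _, _ in records)
    p = [[0.0] * (n + 1) for _ in range(n + 1)]
    q = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i, j, p_val, q_val in records:
        p[i][j] = p_val  # placement is driven by the indices in the file
        q[i][j] = q_val
    return p, q


# First three lines of the sample data above, standing in for a real file.
sample = io.StringIO(
    "0 0 -0.105961 0.00000\n"
    "1 0 -1.06965 0.00000\n"
    "1 1 -0.0187213 -0.240237\n"
)
p, q = read_triangular(sample)
print(p[1][1], q[1][1])
```

With a real file this would be called as `read_triangular(open(path))`, where `path` is whatever the file is called. Entries above the diagonal simply stay at zero.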