I have a function which runs in a loop performing calculations on a list of arrays.
At a point during the first iteration of the function, a polars lazyframe is initialised.
On the following iterations, a new dataframe is created using the same schema, the two dataframes are stacked row-wise using `DataFrame.vstack`, and the result is converted back to a lazyframe.
```python
import numpy as np
import polars as pl

def my_func():
    # This is just for example and not representative of the shape of the real array.
    array_list = [np.zeros((1, 19))] * 2
    for i, _ in enumerate(array_list):
        # Calculations are done here.
        result = np.zeros((1, 19))  # result of calculations (correct shape of real result)
        if i < 1:
            result_df = pl.DataFrame(data=result, schema={'MDL','MVL','MWVL','RR','DET','ADL','LDL','DIV','EDL','LAM','TT','LVL','EVL','AWVL','LWVL','LWVLI','EWVL','Ratio_DRR','Ratio_LD'}, orient='row').lazy()
        else:
            new_df = pl.DataFrame(data=result, schema={'MDL','MVL','MWVL','RR','DET','ADL','LDL','DIV','EDL','LAM','TT','LVL','EVL','AWVL','LWVL','LWVLI','EWVL','Ratio_DRR','Ratio_LD'}, orient='row')
            # Append the new dataframe to the results.
            result_df = result_df.collect().vstack(new_df, in_place=True).lazy()
    return result_df
```

When returning the dataframe outside the function, the column names are no longer in order, but the data is.
e.g.

```
result.schema
OrderedDict([('LAM', Float64), ('LDL', Float64), ('ADL', Float64), ('DIV', Float64),
             ('MDL', Float64), ('MWVL', Float64), ('LWVL', Float64), ('MVL', Float64),
             ('TT', Float64), ('DET', Float64), ('RR', Float64), ('EDL', Float64),
             ('Ratio_LD', Float64), ('Ratio_DRR', Float64), ('LVL', Float64),
             ('LWVLI', Float64), ('EWVL', Float64), ('EVL', Float64), ('AWVL', Float64)])
```

I assume this is due to my naivety about how lazyframes work, but is there a way to enforce the order without renaming the columns?
Thanks.