Channel: Active questions tagged python - Stack Overflow

Data reconciliation with Python: out-of-memory error


I want to reconcile data read from two different databases. The job runs on a single machine, and I get an out-of-memory error while reading the data.

I read the data with pandas, using two different SQL queries as source and target. I can't simply use chunksize, because I need to pull the same rows from both systems in order to reconcile them.

My main problem is pulling the same data from the two systems in chunks and comparing the chunks in a loop.
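One way to do this is to give both queries the same ORDER BY key and read them with pandas' chunksize, so that chunk N from the source lines up with chunk N from the target. The sketch below is a minimal, hypothetical stand-in: two in-memory SQLite databases play the role of the real source and target connections, and the table/column names are illustrative, not from the question's code.

```python
# Minimal sketch: chunk-by-chunk reconciliation of two sources.
# Both queries ORDER BY the same key, so corresponding chunks
# cover the same id range and can be compared pairwise.
import sqlite3
import pandas as pd

def make_db(values):
    """Build an in-memory SQLite db standing in for a real source/target."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, value REAL)")
    con.executemany("INSERT INTO t VALUES (?, ?)", values)
    con.commit()
    return con

source = make_db([(i, i * 0.5) for i in range(1, 11)])
# target differs from source only at id 7
target = make_db([(i, i * 0.5 if i != 7 else 99.0) for i in range(1, 11)])

query = "SELECT id, value FROM t ORDER BY id"  # deterministic order is essential
chunksize = 4

mismatches = []
src_chunks = pd.read_sql(query, source, chunksize=chunksize)
tgt_chunks = pd.read_sql(query, target, chunksize=chunksize)
for src, tgt in zip(src_chunks, tgt_chunks):
    # align the two chunks on id, then keep rows whose values differ
    merged = src.merge(tgt, on="id", suffixes=("_src", "_tgt"))
    mismatches.append(merged[merged["value_src"] != merged["value_tgt"]])

report = pd.concat(mismatches, ignore_index=True)
print(report)
```

This keeps only one chunk per side in memory at a time; the trade-off is that it relies on both systems returning rows in exactly the same order.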

Available code:

smartdq_logging('Log', gv_log_level, f"connecting to source db...")
cl_connect_sourcedb = lib.environment_connection.DBConnect()
cl_connect_sourcedb.connect_with_environment_id(environmentIdSource, "SMARTDQ", cl_connect_smartdq)

smartdq_logging('Log', gv_log_level, f"connecting to target db...")
cl_connect_target = lib.environment_connection.DBConnect()
cl_connect_target.connect_with_environment_id(environmentIdTarget, "SMARTDQ", cl_connect_smartdq)

source_query = """SELECT
    LEVEL AS id,
    ROUND(DBMS_RANDOM.VALUE(0, 99999) / 100000, 5) AS value
FROM
    DUAL
CONNECT BY
    LEVEL <= 1000000"""

target_query = """SELECT
    LEVEL AS id,
    ROUND(DBMS_RANDOM.VALUE(0, 99999) / 100000, 5) AS value
FROM
    DUAL
CONNECT BY
    LEVEL <= 1000000"""

print('start recon...')
df_source = db_func.read_data_to_dataframe(source_query, cl_connect_sourcedb, "original")
df_target = db_func.read_data_to_dataframe(target_query, cl_connect_target, "original")
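The code above materializes both million-row result sets at once, which is what exhausts memory. A more robust alternative to ordering-based chunking is keyed windowing: pull the same id range (WHERE id BETWEEN lo AND hi) from both systems on each loop iteration, so the two sides are guaranteed to cover identical data regardless of driver or chunksize behavior. The sketch below again uses in-memory SQLite stand-ins; the function and variable names are illustrative and do not come from the question's db_func/lib modules.

```python
# Sketch: keyed windowing. Each iteration fetches the same id range
# from both systems and compares only that slice.
import sqlite3
import pandas as pd

def make_db(values):
    """In-memory SQLite db standing in for a real source/target."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, value REAL)")
    con.executemany("INSERT INTO t VALUES (?, ?)", values)
    con.commit()
    return con

source = make_db([(i, float(i)) for i in range(1, 21)])
# target differs from source only at id 13
target = make_db([(i, float(i) if i != 13 else -1.0) for i in range(1, 21)])

window_query = "SELECT id, value FROM t WHERE id BETWEEN ? AND ? ORDER BY id"
max_id, step = 20, 5

diffs = []
for lo in range(1, max_id + 1, step):
    hi = lo + step - 1
    src = pd.read_sql(window_query, source, params=(lo, hi))
    tgt = pd.read_sql(window_query, target, params=(lo, hi))
    # outer merge catches rows missing on either side as well as value drift
    merged = src.merge(tgt, on="id", how="outer",
                       suffixes=("_src", "_tgt"), indicator=True)
    bad = merged[(merged["_merge"] != "both")
                 | (merged["value_src"] != merged["value_tgt"])]
    diffs.append(bad)

report = pd.concat(diffs, ignore_index=True)
print(report[["id"]])
```

For the Oracle queries in the question, the equivalent window predicate would go on the generated id column; the key only needs to be unique and queryable on both systems.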

