I have two CSV files, file1.csv and file2.csv. Both have multiple columns, as follows:
a. file1.csv
username,user id,access hash,name,group,group id
SreyTey1998,963229606,7854138709318981862,Smaradey Chan,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
Srey_Tey_1,2079816779,6921382059939144796,Srey tey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
sreytey123,5316691604,668712126044928206,Phat SreyTey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
Sreytey168,5455045488,-714912998136226691,Vong Soksreytey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
SreyTey99,5653783510,-2575791274366210473,Oun Tey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
sreytey1919,5819100400,3174041461521242292,Tey Tey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
Sreytey6666,6001252515,1586106578669001327,Srey Tey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
SreyTey7777,6026179841,5596849859821333867,Srey Tey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
Ahh_Nak86,5637888996,-1267155033181296023,Yìì Ng,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461

b. file2.csv
username,user id,access hash,name,group,group id
SreyTey1998,963229606,7854138709318981862,Smaradey Chan,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
Srey_Tey_1,2079816779,6921382059939144796,Srey tey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
sreytey123,5316691604,668712126044928206,Phat SreyTey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
Sreytey168,5455045488,-714912998136226691,Vong Soksreytey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
SreyTey99,5653783510,-2575791274366210473,Oun Tey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
sreytey1919,5819100400,3174041461521242292,Tey Tey,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
AhhLyn1213,808888756,2482753619838480608,Ly-លី🌈â¤ï¸,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
ahhly09,938983724,-8302570306911018211,方塔莉,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
ahh_vong,873218908,1743989214734522713,Mek Sreyvong,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
ahhnitaccd,5420585351,-6331445989210603589,NITA CCD,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461

c. The output file, file2-nodups.csv, should be:
username,user id,access hash,name,group,group id
AhhLyn1213,808888756,2482753619838480608,Ly-លី🌈â¤ï¸,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
ahhly09,938983724,-8302570306911018211,方塔莉,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
ahh_vong,873218908,1743989214734522713,Mek Sreyvong,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461
ahhnitaccd,5420585351,-6331445989210603589,NITA CCD,Zisy Ly បោះដុំនឹងលក់រាយកម្មង់ពីរោងចក្រផ្ទាល់📥លáŸážážœáŸážšáž›áž»áž™0967699965,1806798461

I have tried the following code:
with open('file1.csv', 'r', encoding="utf8") as t1:
    fileone = t1.readlines()
with open('file2.csv', 'r', encoding="utf8") as t2:
    filetwo = t2.readlines()

# scans through the two files and writes differences to new csv
with open('file2-nodups.csv', 'w', encoding="utf8") as outFile:
    for line in filetwo:
        if line not in fileone:
            outFile.write(line)

The above does not work - because the output file (file2-nodups.csv) has the same content as file2.csv.
I would greatly appreciate any advice.
I can confirm that the code above works for the following data:
file1.csv:
username,user id,access hash,name,group,group id
asgie2,19933,29kd982hi4hh6h443,
47uuha,491920,kdsagku5kkajgjag,
james_sing,4002899,4asg37yragdh300asgdlk,
joe_naro,4989222,hgjhe84jkaglagjj,
48700245,hlvkiiwej8njnnrk320kc,

file2.csv:
username,user id,access hash,name,group,group id
misschue,87340a,hgeikka83llagea,
james_sing,4002899,4asg37yragdh300asgdlk,
michell22,4883140,cn2ukkfhiigakgd3yhg,

Output file:
username,user id,access hash,name,group,group id
misschue,87340a,hgeikka83llagea,
michell22,4883140,cn2ukkfhiigakgd3yhg,

But it does not work for the data with more columns shown above. I would greatly appreciate any advice.
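One possible direction, offered only as a sketch: comparing whole lines as strings fails whenever two rows differ in something invisible (a byte-order mark, trailing whitespace, a missing final newline), so it may be more robust to parse both files with the `csv` module and compare a single key column. The sketch below assumes that `user id` uniquely identifies a row, which the question does not state, and the function name `write_rows_missing_from` is hypothetical. It has not been run against the real files.

```python
import csv


def write_rows_missing_from(first_path, second_path, out_path, key="user id"):
    """Copy rows of second_path whose key value does not occur in first_path.

    Assumption: the `key` column identifies a row uniquely.
    Returns the number of data rows written.
    """
    # "utf-8-sig" reads plain UTF-8 and also skips a leading BOM if present.
    with open(first_path, newline="", encoding="utf-8-sig") as f1:
        reader = csv.reader(f1)
        header = [h.strip() for h in next(reader)]
        idx = header.index(key)
        # Collect the stripped key of every row long enough to have one.
        seen = {row[idx].strip() for row in reader if len(row) > idx}

    with open(second_path, newline="", encoding="utf-8-sig") as f2, \
            open(out_path, "w", newline="", encoding="utf-8") as out:
        reader = csv.reader(f2)
        writer = csv.writer(out)
        header = next(reader)
        idx = [h.strip() for h in header].index(key)
        writer.writerow(header)
        kept = 0
        for row in reader:
            # Blank or too-short rows are skipped rather than copied.
            if len(row) > idx and row[idx].strip() not in seen:
                writer.writerow(row)
                kept += 1
    return kept
```

Calling `write_rows_missing_from('file1.csv', 'file2.csv', 'file2-nodups.csv')` would then produce the output file. If the rows must match on every column rather than on `user id` alone, the same idea works with `tuple(cell.strip() for cell in row)` as the set element instead of a single cell.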