FYI, I am a complete Python novice. I have a for loop that extracts some object info from an S3 bucket and writes it to a CSV file. For every object whose details are retrieved, that data should be appended as one row in the CSV. My issue is that I am getting duplicate entries in the CSV. What I expect in the CSV is:
account_id;arn
key1;body1
key2;body2
key3;body3... (until the loop runs through all objects in that folder).
But what I am actually getting is:
account_id;arn
key1;body1
account_id;arn
key1;body1
account_id;arn
key2;body2
account_id;arn
key1;body1
account_id;arn
key2;body2
account_id;arn
key3;body3
Also, every time I run the script, it appends the old data again, which compounds the problem.
My current piece of code is:
import csv

for objects in my_bucket.objects.filter(Prefix="folderpath"):
    key = objects.key
    body = objects.get()['Body'].read()
    field = ["account_id", "arn"]
    data = [[key, body]]
    with open("my_file.csv", "a") as f:
        writer = csv.writer(f, delimiter=";", lineterminator="\n")
        writer.writerow(field)
        writer.writerows(data)
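For comparison, here is a sketch of the pattern I think should produce the expected output: open the file once in "w" mode (so old runs are overwritten), write the header a single time before the loop, and only write data rows inside the loop. I've replaced the real S3 call with a placeholder list, since the bucket isn't shareable; the keys and bodies below are made up.

```python
import csv

# Placeholder for my_bucket.objects.filter(Prefix="folderpath") --
# in the real script each item would come from an S3 ObjectSummary.
objects = [("key1", "body1"), ("key2", "body2"), ("key3", "body3")]

field = ["account_id", "arn"]

# Open once in "w" mode so previous runs are overwritten,
# and write the header exactly once, before the loop.
with open("my_file.csv", "w", newline="") as f:
    writer = csv.writer(f, delimiter=";")
    writer.writerow(field)
    for key, body in objects:
        writer.writerow([key, body])
```

This gives one header line followed by one row per object, and re-running the script starts the file fresh instead of stacking old results.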