I just downloaded a TCGA genomic dataset. The genomic data files are organized into one folder per case, and a single sample CSV file lists all of the files. The CSV is structured like this:
folder_name, file_name
folder1, file1.txt
folder2, file2.txt
folder3, file3.txt
And each file is a spreadsheet of the genes stacked vertically in this format:
file_path: folder1/file1.txt
geneA, 5
geneB, 2
geneC, 4
How can I write a loop that opens each file and merges its gene values into the corresponding row of the sample CSV, to get the following format?
folder_name, file_name, geneA, geneB, geneC
folder1, file1.txt, 5, 2, 4
folder2, file2.txt, 4, 3, 5
folder3, file3.txt, 6, 2, 4
There could be files where one of the genes (e.g. geneB) is missing, in which case a blank or n/a value would be acceptable.
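
For reference, here is a minimal sketch of the kind of loop being asked about, using only the Python standard library. It rests on several assumptions that may not hold for the real download: the sample CSV has a header row with exactly the columns folder_name and file_name, each gene file holds one "gene, value" pair per line, and everything is comma-separated (pass a different `delimiter` if the files turn out to be tab-separated). The function name `merge_gene_files` and its parameters are hypothetical names chosen for illustration.

```python
import csv
from pathlib import Path


def merge_gene_files(sample_csv, root_dir, out_csv, missing="n/a", delimiter=","):
    """Write one row per sample: folder_name, file_name, then one column per gene.

    Assumes each gene file has lines of the form "gene<delimiter>value".
    """
    root = Path(root_dir)
    rows = []        # one dict per sample
    gene_names = {}  # dict used as an ordered set: genes in first-seen order

    # Read the sample sheet; skipinitialspace drops the space after each comma.
    with open(sample_csv, newline="") as f:
        samples = list(csv.DictReader(f, skipinitialspace=True))

    for sample in samples:
        folder = sample["folder_name"].strip()
        fname = sample["file_name"].strip()
        row = {"folder_name": folder, "file_name": fname}

        # Open the gene file for this sample and turn its rows into columns.
        with open(root / folder / fname, newline="") as gf:
            reader = csv.reader(gf, delimiter=delimiter, skipinitialspace=True)
            for rec in reader:
                if len(rec) < 2:
                    continue  # skip blank or malformed lines
                gene, value = rec[0].strip(), rec[1].strip()
                row[gene] = value
                gene_names.setdefault(gene, None)
        rows.append(row)

    # restval fills in any gene that a given file did not contain.
    fieldnames = ["folder_name", "file_name"] + list(gene_names)
    with open(out_csv, "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames, restval=missing)
        writer.writeheader()
        writer.writerows(rows)
    return fieldnames, rows
```

It would be called along the lines of `merge_gene_files("samples.csv", ".", "merged.csv")`, where the three paths are placeholders. Collecting the rows as dictionaries and writing them at the end means the set of gene columns does not need to be known up front, and a gene missing from one file simply gets the `missing` value in that row.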