I want to take some simple XML files and convert them all to CSV in one go (though this code handles just one file at a time). It looks to me like there are no official namespaces, but I'm not sure. I have this code (I used one header, SubmittingSystemVendor, but I really want to write all of them to CSV):
import csv
import lxml.etree

x = r'C:\Users\...\jh944.xml'
with open('output.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerow('SubmittingSystemVendor')
    root = lxml.etree.fromstring(x)
    writer.writerow(row)
Here is a sample of the XML file:
<?xml version="1.0" encoding="utf-8"?>
<EOYGeneralCollectionGroup SchemaVersionMajor="2014-2015" SchemaVersionMinor="1" CollectionId="157" SubmittingSystemName="MISTAR" SubmittingSystemVendor="WayneRESA" SubmittingSystemVersion="2014" xsi:noNamespaceSchemaLocation="http://cepi.state.mi.us/msdsxml/EOYGeneralCollection2014-20151.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <EOYGeneralCollection>
    <SubmittingEntity>
      <SubmittingEntityTypeCode>D</SubmittingEntityTypeCode>
      <SubmittingEntityCode>82730</SubmittingEntityCode>
    </SubmittingEntity>
The error is:
lxml.etree: Start tag expected, '<' not found, line 1, column 1
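
A likely cause of this error is that lxml.etree.fromstring expects the XML text itself, so passing it a file path makes the parser try to read the path string as XML, and a path does not start with '<'. Parsing from the file with parse avoids that. Below is a minimal sketch of that idea, assuming "all of them" means the attributes on the root element. It uses the standard library's xml.etree.ElementTree rather than lxml (lxml.etree offers the same parse(...).getroot().attrib calls), and the function names root_attributes and write_attributes_csv are made up for illustration.

```python
import csv
import xml.etree.ElementTree as ET


def root_attributes(xml_path):
    """Parse the XML file at xml_path and return the root element's attributes."""
    # parse() takes a filename or file object; fromstring() expects XML text.
    return dict(ET.parse(xml_path).getroot().attrib)


def write_attributes_csv(xml_paths, csv_path):
    """Write one CSV row per XML file, with one column per root attribute."""
    rows = [root_attributes(p) for p in xml_paths]

    # Collect column names in first-seen order across all files.
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)

    # newline='' keeps the csv module from emitting blank lines on Windows.
    with open(csv_path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval='')
        writer.writeheader()
        writer.writerows(rows)
    return fieldnames
```

Called with a list of paths (for example, one built with glob.glob over the folder of XML files), this writes a header row followed by one row per file. Note that the namespaced attribute xsi:noNamespaceSchemaLocation appears as a column named {http://www.w3.org/2001/XMLSchema-instance}noNamespaceSchemaLocation, since the parser expands the xsi prefix to its URI.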