Channel: Active questions tagged python - Stack Overflow

Python Beginner: Converting an HTML document scraped from a website to a dataframe


I am trying to scrape football players' data from the website FBRef. I got the data from the website as a bs4.element.ResultSet object.

Code:

    import re  # needed for re.compile below

    import requests
    from bs4 import BeautifulSoup
    import pandas as pd
    import numpy as np

    res = requests.get("https://fbref.com/en/comps/9/stats/Premier-League-Stats")

    # Strip the HTML comment markers so tables wrapped in comments get parsed
    comp = re.compile("<!--|-->")
    soup = BeautifulSoup(comp.sub("", res.text), 'lxml')

    all_data = soup.find_all("tbody")
    player_data = all_data[2]

The data is as follows:

<tr><th class="right" **...** href="/en/players/774cf58b/Max-Aarons">Max Aarons</a></td><td **...** data-stat="position">DF</td><td class="left" data-stat="team"><a href="/en/squads/4ba7cbea/Bournemouth-Stats">Bournemouth</a></td><td class="center" data-stat="age">24-084</td><td class="center" data-stat="birth_year">2000</td><td**...** </a></td></tr>
<tr><th class="right" **...** href="/en/players/77816c91/Benie-Adama-Traore">Bénie Adama Traore</a></td><td **...** data-stat="position">FW,MF</td><td class="left" data-stat="team"><a href="/en/squads/1df6b87e/Sheffield-United-Stats">Sheffield Utd</a></td><td class="center" data-stat="age">21-119</td><td class="center" data-stat="birth_year">2002 **...** </a></td></tr>
**...**

I want to create a pandas DataFrame from this, such as:

Name                Position  Team           Age  Birth Year  ...
Max Aarons          DF        Bournemouth    24   2000
Benie Adama Traore  FW        Sheffield Utd  21   2002
...

Thanks in advance. I looked at similar questions here and tried to apply the solutions, but couldn't make them work.
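
For reference, here is a minimal sketch of one way the conversion could go. It is not a confirmed solution for the live page. It reads each cell's data-stat attribute and builds one record per row. The inline SAMPLE_HTML is a simplified stand-in for the scraped tbody, and data-stat="player" is an assumption, since that attribute is elided in the output above. The helper name tbody_to_dataframe and the COLUMNS mapping are hypothetical. Keeping only the first listed position and the years part of the age follows the desired table above.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Simplified stand-in for the scraped <tbody>. The data-stat="player"
# attribute is an ASSUMPTION: it is elided in the question's output.
SAMPLE_HTML = """
<table><tbody>
<tr><th class="right"></th><td data-stat="player"><a href="/en/players/774cf58b/Max-Aarons">Max Aarons</a></td><td data-stat="position">DF</td><td class="left" data-stat="team"><a href="/en/squads/4ba7cbea/Bournemouth-Stats">Bournemouth</a></td><td class="center" data-stat="age">24-084</td><td class="center" data-stat="birth_year">2000</td></tr>
<tr><th class="right"></th><td data-stat="player"><a href="/en/players/77816c91/Benie-Adama-Traore">Bénie Adama Traore</a></td><td data-stat="position">FW,MF</td><td class="left" data-stat="team"><a href="/en/squads/1df6b87e/Sheffield-United-Stats">Sheffield Utd</a></td><td class="center" data-stat="age">21-119</td><td class="center" data-stat="birth_year">2002</td></tr>
</tbody></table>
"""

# Hypothetical mapping from data-stat values to the desired column labels.
COLUMNS = {
    "player": "Name",
    "position": "Position",
    "team": "Team",
    "age": "Age",
    "birth_year": "Birth Year",
}


def tbody_to_dataframe(tbody):
    """Build a DataFrame from the <tr> rows of a bs4 <tbody> Tag."""
    records = []
    for tr in tbody.find_all("tr"):
        # Map each cell's data-stat attribute to its visible text.
        cells = {
            cell.get("data-stat"): cell.get_text(strip=True)
            for cell in tr.find_all(["th", "td"])
        }
        if "player" not in cells:
            continue  # skip rows that carry no player cell
        records.append({label: cells.get(stat, "") for stat, label in COLUMNS.items()})

    df = pd.DataFrame(records, columns=list(COLUMNS.values()))
    # "FW,MF" -> "FW": keep only the first listed position.
    df["Position"] = df["Position"].str.split(",").str[0]
    # "24-084" -> "24": keep only the part before the hyphen.
    df["Age"] = df["Age"].str.split("-").str[0]
    return df


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
player_df = tbody_to_dataframe(soup.find("tbody"))
print(player_df)
```

With the code from the question, the same helper would presumably be called as tbody_to_dataframe(player_data), provided the real cells carry the data-stat values assumed here. All columns come out as strings, so Age and Birth Year would still need a numeric conversion if that matters.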
