Channel: Active questions tagged python - Stack Overflow

Obtain href links from first table in headless browser page (playwright._impl._errors.Error: Event loop is closed! Is Playwright already stopped?)

I am trying to get the href links from the first table on a page loaded in a headless browser, but the error message doesn't help: it doesn't tell me what is wrong, and the traceback just shows rows of ^ symbols under the code.

I had to switch to a headless browser because my earlier scraping returned empty tables, which seems to be down to how the site's HTML works, and I admit I don't understand how it works.

I also want to complete the relative links into full URLs so that they work for further use, which is what the last three lines of the loop in the following code are for:

from playwright.sync_api import sync_playwright

# headless browser to scrape
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://fbref.com/en/comps/9/Premier-League-Stats")

# open the file up
with open("path", 'r') as f:
    file = f.read()

years = list(range(2024, 2022, -1))
all_matches = []
standings_url = "https://fbref.com/en/comps/9/Premier-League-Stats"

for year in years:
    standings_table = page.locator("table.stats_table").first
    link_locators = standings_table.get_by_role("link").all()
    for l in link_locators:
        l.get_attribute("href")
    print(link_locators)
    link_locators = [l for l in links if "/squads/" in l]
    team_urls = [f"https://fbref.com{l}" for l in link_locators]
    print(team_urls)

browser.close()
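
For the link-completion step, here is a minimal sketch using only the standard library. It assumes the href values have already been collected as plain strings (for example, the return values of get_attribute("href"), which the loop above calls but does not store). The helper name build_team_urls and the sample hrefs are hypothetical and only there for illustration.

```python
from urllib.parse import urljoin

BASE_URL = "https://fbref.com"  # site root, taken from the URLs in the question


def build_team_urls(hrefs, base_url=BASE_URL):
    """Keep only squad links and turn them into absolute URLs."""
    team_urls = []
    for href in hrefs:
        # get_attribute("href") can return None when the attribute is missing
        if href and "/squads/" in href:
            team_urls.append(urljoin(base_url, href))
    return team_urls


# Made-up hrefs, for illustration only
sample_hrefs = [
    "/en/squads/abc123/Example-Team-Stats",
    "/en/players/xyz/Some-Player",
    None,
]
print(build_team_urls(sample_hrefs))
```

Filtering on strings rather than on Locator objects matters here: the "/squads/" in l test only makes sense once each l is the href text itself.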

The stack trace I get is just:

Traceback (most recent call last):
  File "path", line 27, in <module>
    link_locators = standings_table.get_by_role("link").all()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "path\.venv\Lib\site-packages\playwright\sync_api\_generated.py", line 15936, in all
    return mapping.from_impl_list(self._sync(self._impl_obj.all()))
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "path\.venv\Lib\site-packages\playwright\_impl\_sync_base.py", line 102, in _sync
    raise Error("Event loop is closed! Is Playwright already stopped?")
playwright._impl._errors.Error: Event loop is closed! Is Playwright already stopped?

Process finished with exit code 1

My code is only 33 lines long, as it's just the start of a loop, so I'm unsure what the last two entries in the stack trace (line 15936 and line 102) refer to.
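
The last two entries in the trace point into Playwright's own files, not into the script. One likely reading of the message is that page is being used after the with sync_playwright() block has ended, at which point Playwright has already been stopped. Below is a minimal sketch, under that assumption, that keeps every browser call inside the block and carries only plain strings out of it. The helper names collect_hrefs and main are hypothetical; the selector and URL are copied from the question.

```python
def collect_hrefs(page):
    """Return the href of every link in the first stats table on the page."""
    table = page.locator("table.stats_table").first
    # Building a locator is lazy; .all() and get_attribute() are the calls
    # that actually talk to the browser, so they need a live Playwright.
    return [link.get_attribute("href") for link in table.get_by_role("link").all()]


def main():
    # Imported here so the helper above can be read without Playwright installed
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("https://fbref.com/en/comps/9/Premier-League-Stats")
        hrefs = collect_hrefs(page)  # still inside the with block
        browser.close()

    # Plain strings remain usable after Playwright has stopped
    print(hrefs)
```

Calling main() requires Playwright and its browser binaries to be installed. The sketch does not call it automatically.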

I just can't extract the href links. It might have to do with .first.

I implemented the solution from "Get href link using python playwright", but it doesn't work.

