Channel: Active questions tagged python - Stack Overflow

Accessing nested element using beautifulsoup

I want to find all the li elements nested within <ol class="messageList" id="messageList">. I have tried the following solutions and they all return 0 messages:

    messages = soup.find_all("ol")
    messages = soup.find_all('div', class_='messageContent')
    messages = soup.find_all("li")
    messages = soup.select('ol > li')
    messages = soup.select('.messageList > li')

The full html can be seen here in this gist.

  1. What is the correct way of grabbing these list items?
  2. In Beautiful Soup, do you have to know the nested path to get the element you are after, or is something like soup.find_all("li") supposed to return all matching elements, whether they are nested or not?
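
For reference on the second point, here is a minimal sketch of how find_all behaves on nested markup. The HTML string is made up to mimic the structure described above (it is not the page from the gist). As far as I understand it, find_all searches all descendants by default, and recursive=False restricts it to direct children:

```python
from bs4 import BeautifulSoup

# Made-up HTML that mimics the structure described in the question;
# it is NOT the real page from the gist.
html = """
<div class="pageContent">
  <ol class="messageList" id="messageList">
    <li class="message">first <ul><li>nested item</li></ul></li>
    <li class="message">second</li>
  </ol>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all looks at every descendant by default, so no path is needed.
all_items = soup.find_all("li")

# Restrict the search to direct children of the <ol> only.
message_list = soup.find("ol", id="messageList")
direct_items = message_list.find_all("li", recursive=False)

# The CSS child combinator gives the same direct-children result.
selected = soup.select("#messageList > li")

print(len(all_items), len(direct_items), len(selected))
```

So on well-formed input none of the attempts above should come back empty, which suggests the problem is in the input rather than in the selectors.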

Happy for non-bs4 answers too.
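
To illustrate what a non-bs4 approach could look like, here is a rough sketch using only the standard library's html.parser. The class name MessageListCounter, the VOID_TAGS set, and the sample string are my own placeholders, and the tag-stack handling is deliberately simplistic, so treat it as a starting point rather than a robust parser:

```python
from html.parser import HTMLParser

# Tags that never get a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class MessageListCounter(HTMLParser):
    """Counts <li> elements that are direct children of <ol id="messageList">."""

    def __init__(self):
        super().__init__()
        self.stack = []  # (tag, id) pairs for currently open tags
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Count the <li> only if its immediate parent is the target <ol>.
        if tag == "li" and self.stack and self.stack[-1] == ("ol", "messageList"):
            self.count += 1
        self.stack.append((tag, dict(attrs).get("id")))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating unclosed tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


# Made-up sample input, not the real page.
sample = ('<ol class="messageList" id="messageList">'
          '<li>a<br><ul><li>x</li></ul></li><li>b</li></ol>')
counter = MessageListCounter()
counter.feed(sample)
print(counter.count)
```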

Update

This is how I load the HTML and create the soup:

    from bs4 import BeautifulSoup

    # Load the HTML content
    with open('/tmp/property.html', 'r', encoding='utf-8') as file:
        html_content = file.read()

    # Create a BeautifulSoup object and specify the parser
    soup = BeautifulSoup(html_content, 'html.parser')

The file is in the gist link above.

Update 2

I got it working using the requests library. It looks like manually downloading the file might have broken some of the HTML?

    import requests
    from bs4 import BeautifulSoup

    url = "https://www.propertychat.com.au/community/threads/melbourne-property-market-2024.75213/"
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")
    messages = soup.select('.messageList > li')
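
One way to check whether the saved file really was the problem is to compare what is in the raw text with what the parser actually finds. This is only a sketch: summarize is a hypothetical helper, and the "broken" string simulates one possible kind of damage (escaped angle brackets), which is an assumption and not necessarily what happened to my file:

```python
from bs4 import BeautifulSoup


def summarize(html_text):
    """Return (marker_in_raw_text, number_of_matched_items) for one HTML string."""
    marker_present = 'id="messageList"' in html_text
    soup = BeautifulSoup(html_text, "html.parser")
    items = soup.select(".messageList > li")
    return marker_present, len(items)


# Made-up inputs for illustration only.
good = '<ol class="messageList" id="messageList"><li>a</li><li>b</li></ol>'
# Hypothetical damage: opening angle brackets were escaped on save,
# so the markup is now plain text as far as the parser is concerned.
broken = good.replace("<", "&lt;")

good_result = summarize(good)
broken_result = summarize(broken)
print(good_result, broken_result)
```

If the marker is present in the raw text but the selector matches nothing, the markup itself is probably damaged and no selector will help.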
