Channel: Active questions tagged python - Stack Overflow

Using BeautifulSoup to programmatically compare financial website data

I'm new to BeautifulSoup. I decided to try it after reading great things about it, but frustratingly I hit a wall very early.

My idea was to pull data from MarketWatch and Google Finance and compare figures that should be the same metric on both sites. My first case was the EPS estimates and actuals for the past 4 years/quarters and the coming quarter/year. I started with MarketWatch, where that data is on this page (for AAPL):

https://www.marketwatch.com/investing/stock/aapl/analystestimates
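For context, the comparison I have in mind is roughly the sketch below. The function name `compare_eps`, the 1% tolerance, and the period keys are my own assumptions, and the numbers are placeholders for illustration, not real EPS figures from either site.

```python
import math


def compare_eps(source_a, source_b, rel_tol=0.01):
    """Return {period: (a, b)} for periods in both sources whose values disagree."""
    mismatches = {}
    # Only periods reported by both sources can be compared.
    for period in sorted(source_a.keys() & source_b.keys()):
        a, b = source_a[period], source_b[period]
        if not math.isclose(a, b, rel_tol=rel_tol):
            mismatches[period] = (a, b)
    return mismatches


# Placeholder values only; these are NOT real EPS numbers.
marketwatch_eps = {"Q1": 1.00, "Q2": 1.10, "Q3": 1.20}
google_eps = {"Q1": 1.00, "Q2": 1.25, "Q4": 1.30}

print(compare_eps(marketwatch_eps, google_eps))
```

Periods that appear in only one source are skipped here; whether to report those as well is a design choice I haven't settled on.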

However, after I added my actual Chrome User-Agent to the request headers, the site returned a page saying I need to upgrade my browser. I then blindly copied all of my Chrome request headers and tried again, with the same result:

from bs4 import BeautifulSoup
import requests

url = "https://www.marketwatch.com/investing/stock/inmd/analystestimates"

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
    'Cache-Control': 'max-age=0',
    'Cookie': (
        'letsGetMikey=enabled; refresh=off; letsGetMikey=enabled; refresh=off; gdprApplies=false; ab_uuid=55e2945f-a2ec-43a0-a847-e8816e3afdda; dnsDisplayed=undefined; ccpaApplies=false; signedLspa=undefined; _pubcid=fa0bf288-7d62-4110-8372-1b024d6b4f7d; _sp_su=false; ccpaUUID=326cb90b-0dd7-4910-8d9d-5ff61c05bae5; permutive-id=af5bf5ac-6fdf-42cf-93e8-f723fa40f521; vcdpaApplies=false; regulationApplies=gdpr%3Afalse%2Ccpra%3Afalse%2Cvcdpa%3Afalse; usr_bkt=K8ZnHJ5Uyn; _mfuuid_=cc89c62a-e0bd-47b4-b5e1-b69f5ec44604; djvideovol=1; AMCVS_CB68E4BA55144CAA0A4C98A5%40AdobeOrg=1; _pnvl=false; pushly.user_puuid=SgW5vS3oF2lfhrcel0zRmnFvK4SgPd6F; _rdt_uuid=1697043746155.909bd6b4-3c1f-4ede-ae69-19c3fbcdaa48; s_cc=true; _pcid=%7B%22browserId%22%3A%22lnm012x2m39y5ji8%22%7D; cX_P=lnm012x2m39y5ji8; _pctx=%7Bu%7DN4IgrgzgpgThIC4B2YA2qA05owMoBcBDfSREQpAeyRCwgEt8oBJAEzIE4AmHgZi4CsvAIwB2DqIAMADkHTRvEAF8gA; _ncg_domain_id_=63b14ff9-3254-4fd6-bbb3-54f2adb4bf7a.1.1697043745806.1760115745806; _dj_sp_id=6daf2311-13b7-45df-b86c-b136409b9df8; _ncg_g_id_=2a1dc274-c24c-4613-bc6c-85cbe03b9692.1.1697043745.1760115745806; cX_G=cx%3A1v0qwnukyiwjo3v0qptcgcbksq%3A1si5pep8z172j; _pnlspid=11018; _ncg_id_=63b14ff9-3254-4fd6-bbb3-54f2adb4bf7a; _pnss=blocked; letsGetMikey=enabled; wsjregion=na%2Cus; _cls_v=bc22f7d5-a749-4df7-b90a-5b281bd95466; _cls_s=8c158253-143e-4031-8bce-ca64366bc44a:0; cls_e=8c158253-143e-4031-8bce-ca64366bc44a:244350746520436; s_tp=4367; _dj_id.cff7=.1697043747.44.1704301559.1704246489.1dc1df74-9643-41fb-80f4-eed69d609e2a; '
        '_ncg_sp_id.f57d=63b14ff9-3254-4fd6-bbb3-54f2adb4bf7a.1697043747.44.1704301559.1704246490.82d9d318-c56b-464a-ac34-10e617082e7c.55b6e9a4-459c-4a61-b6aa-bf5b1b137ef7.e7a54b7a-0e3d-405e-9ccd-475dd2a18017.1704301133757.6; s_ppv=MW_Summaries_Economy%2520%2526%2520Politics_U.S.%2520Economic%2520Calendar%2C36%2C27%2C1554; fullcss-home=site-37758705d2.min.css; refresh=off; fullcss-quote=quote-4f7c97120b.min.css; kayla=g=9b6d321ca05b40eb84182ab4e98ab83e; mw_loc=%7B%22Region%22%3A%22MI%22%2C%22Country%22%3A%22US%22%2C%22Continent%22%3A%22NA%22%2C%22ApplicablePrivacy%22%3A0%7D; icons-loaded=true; fullcss-section=section-4063dd6ae2.min.css; recentqsmkii=Stock-US-INMD|Stock-US-HPE|Stock-US-UNIT|Stock-US-MBLY|Stock-US-A|Stock-US-AEHR|Stock-US-AAPL|CloseEndFund-US-FTHY|CloseEndFund-US-ECAT|CloseEndFund-US-FSCO|PreferredStock-US-MGR|Stock-CA-LTHM|Stock-US-CVX; mw_bulletins=SE58ay; _pubcid_cst=kSylLAssaw%3D%3D; DJSESSION=country%3Dus%7C%7Ccontinent%3Dna%7C%7Cregion%3Dmi; usr_prof_v2=eyJpYyI6NH0%3D; spotim_visitId={%22creationDate%22:%22Wed%20Jan%2010%202024%2017:15:19%20GMT-0500%20(Eastern%20Standard%20Time)%22%2C%22duration%22:0}; utag_main=v_id:018b1fb0868d0007c45f252153a90506f003506700a83$_sn:47$_ss:1$_st:1704926728972$vapi_domain:marketwatch.com$ses_id:1704924928972%3Bexp-session$_pn:1%3Bexp-session$_prevpage:MW_Company%20Analyst%20Estimates%3Bexp-1704928528979; AMCV_CB68E4BA55144CAA0A4C98A5%40AdobeOrg=1585540135%7CMCIDTS%7C19733%7CMCMID%7C89889684677145527779162982137623154110%7CMCAID%7CNONE%7CMCOPTOUT-1704932129s%7CNONE%7CvVersion%7C4.4.0%7CMCAAMLH-1704905931%7C7%7CMCAAMB-1704924927%7Cj8Odv6LonN4r3an7LhD3WZrU1bUpAkFkkiY1ncBR96t2PTI%7CMCSYNCSOP%7C411-19733'
    ),
    'Dnt': '1',
    'Referer': 'https://www.marketwatch.com/investing/stock/inmd/analystestimates',
    'Sec-Ch-Ua': '"Not_A Brand";v="8", "Chromium";v="120", "Google Chrome";v="120"',
    'Sec-Ch-Ua-Mobile': '?0',
    'Sec-Ch-Ua-Platform': '"Windows"',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-User': '?1',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
}

html_content = requests.get(url, headers=headers).text
soup = BeautifulSoup(html_content, "lxml")
print(soup)

yields

...<p class="text">This browser is no longer supported at MarketWatch. For the best MarketWatch.com experience, please update to a modern browser.</p></div><div class="group group--buttons"><a class="btn btn--primary" href="https://www.google.com/chrome/">Chrome</a><a class="btn btn--primary" href="https://support.apple.com/downloads/safari">Safari</a><a class="btn btn--primary" href="https://www.mozilla.org/en-US/firefox/">Firefox</a><a class="btn btn--primary" href="https://www.microsoft.com/en-us/windows/microsoft-edge">Edge</a>...

:sound of brakes screeching:
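To make this failure easy to spot in a script rather than by reading the printed soup, a check along these lines could flag the fallback page before any parsing. This is only a sketch: the function and constant names are my own, and the marker phrase is copied from the response above, so it would break if MarketWatch changes that wording.

```python
# Marker phrase taken from the fallback page in the response above.
UNSUPPORTED_MARKER = "This browser is no longer supported"


def is_unsupported_browser_page(html_text):
    """Return True if the HTML looks like the 'update your browser' fallback page."""
    return UNSUPPORTED_MARKER.lower() in html_text.lower()


# Shortened sample of the returned markup, used here instead of a live request.
sample = '<p class="text">This browser is no longer supported at MarketWatch.</p>'
print(is_unsupported_browser_page(sample))
```

In the real script, the argument would be `html_content` from the request.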

Am I missing an important step here? Is BeautifulSoup the wrong tool for the job?

