Hello, I have a strange problem using wget in Python and would be very grateful for any help.
What I want to do:
Download files ('.pdf', '.djvu') from a specific website (e.g. a wiki) with wget in Python, which should be easy. The steps are:
crawl the specific page
get the file links to pass to wget
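
For context, the link-collection step can be sketched roughly as follows. This is only an illustration using the standard library, not the actual crawler code; the names `FileLinkParser` and `collect_file_links` and the example.org URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FileLinkParser(HTMLParser):
    """Collects absolute URLs of <a href> links ending in the given extensions."""

    def __init__(self, base_url, extensions=(".pdf", ".djvu")):
        super().__init__()
        self.base_url = base_url
        self.extensions = extensions
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Compare case-insensitively so ".DJVU" is matched as well.
        if href and href.lower().endswith(self.extensions):
            # Resolve relative and protocol-relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def collect_file_links(html, base_url):
    """Return all .pdf/.djvu links found in an HTML string."""
    parser = FileLinkParser(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    sample = '<a href="/files/a.pdf">A</a> <a href="b.html">B</a>'
    print(collect_file_links(sample, "https://example.org/wiki/Page"))
```

Each collected URL is then what gets handed to `wget.download(url)`.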
Problem:
It's really strange. On most pages of the website it works well, but on some pages with the same HTML structure it doesn't.
Even on the same page, some files download fine with wget while others fail with the error message below.
Error message:

    C:\start_automation\crawling_job>C:/Users/sa031/AppData/Local/Programs/Python/Python311/python.exe c:/start_automation/crawling_job/download_test.py
    Traceback (most recent call last):
      File "c:\start_automation\crawling_job\download_test.py", line 39, in <module>
        wget.download(url)
      File "C:\Users\sa031\AppData\Local\Programs\Python\Python311\Lib\site-packages\wget.py", line 303, in download
        (fd, tmpfile) = tempfile.mkstemp(".tmp", prefix=prefix, dir=".")
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\sa031\AppData\Local\Programs\Python\Python311\Lib\tempfile.py", line 341, in mkstemp
        return _mkstemp_inner(dir, prefix, suffix, flags, output_type)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "C:\Users\sa031\AppData\Local\Programs\Python\Python311\Lib\tempfile.py", line 256, in _mkstemp_inner
        fd = _os.open(file, flags, 0o600)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    FileNotFoundError: [Errno 2] No such file or directory: 'C:\\start_automation\\crawling_job\\CADAL06210101_%E7%9A%87%E6%B8%85%E7%B6%93%E8%A7%A3%E7%BA%8C%E7%B7%A8%EF%BC%9A%E6%98%93%E5%9C%96%E6%A2%9D%E8%BE%AE%E7%9A%87%E6%B8%85%E7%B6%93%E8%A7%A3%E7%BA%8C%E7%B7%A8%EF%BC%9A%E8%99%9E%E6%B0%8F%E6%98%93%E4%BA%8B.djvu.3kii8ipd.tmp'

What I have done:
I googled the error and tested several different pages in the wiki.
I asked ChatGPT and got the following code, which saves to an absolute path, but it doesn't work either:

    import os
    import wget

    def download_file(url, save_path):
        try:
            print("Downloading file...")
            wget.download(url, save_path)
            print("\nDownload complete!")
        except Exception as e:
            print(f"An error occurred: {e}")

    if __name__ == "__main__":
        # URL of the file to download
        file_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/CADAL06210101_%E7%9A%87%E6%B8%85%E7%B6%93%E8%A7%A3%E7%B6%8C%E7%B7%A8%EF%BC%9A%E6%98%93%E5%9C%96%E6%A2%9D%E8%BE%AE%E7%9A%87%E6%B8%85%E7%B6%93%E8%A7%A3%E7%B6%8C%E7%B7%A8%EF%BC%9A%E8%99%9E%E6%B0%8F%E6%98%93%E4%BA%8B.djvu"
        # Specify an absolute path for saving the file
        save_location = os.path.join(os.getcwd(), "downloaded_file.djvu")
        # Call the function to download the file
        download_file(file_url, save_location)

The code:
This is my own code, with one of the URLs that fails included:

    import wget

    url='https://upload.wikimedia.org/wikipedia/commons/a/a7/CADAL06210101_%E7%9A%87%E6%B8%85%E7%B6%93%E8%A7%A3%E7%BA%8C%E7%B7%A8%EF%BC%9A%E6%98%93%E5%9C%96%E6%A2%9D%E8%BE%AE%E7%9A%87%E6%B8%85%E7%B6%93%E8%A7%A3%E7%BA%8C%E7%B7%A8%EF%BC%9A%E8%99%9E%E6%B0%8F%E6%98%93%E4%BA%8B.djvu'
    wget.download(url)

Maybe:
The problem might be related to the strange temporary file name shown at the end of the error message (`.djvu.3kii8ipd.tmp'`), but I have no idea why.
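
One way to test this guess is to measure how long that temporary path is. The sketch below assumes, without confirmation, that the failure is related to the classic Windows `MAX_PATH` limit of 260 characters and that long-path support is not enabled. It rebuilds a path shaped like the one in the traceback and compares its length to that limit. The helper names (`encoded_file_name`, `temp_path_length`, `decoded_file_name`) are hypothetical, and the random part of the temp suffix is copied from the error message purely for illustration.

```python
from pathlib import PureWindowsPath
from urllib.parse import unquote, urlsplit

# Classic Windows path limit (260 characters including the terminating NUL).
# Assumption: long-path support is not enabled on this machine.
MAX_PATH = 260

URL = ("https://upload.wikimedia.org/wikipedia/commons/a/a7/"
       "CADAL06210101_%E7%9A%87%E6%B8%85%E7%B6%93%E8%A7%A3%E7%BA%8C%E7%B7%A8"
       "%EF%BC%9A%E6%98%93%E5%9C%96%E6%A2%9D%E8%BE%AE%E7%9A%87%E6%B8%85"
       "%E7%B6%93%E8%A7%A3%E7%BA%8C%E7%B7%A8%EF%BC%9A%E8%99%9E%E6%B0%8F"
       "%E6%98%93%E4%BA%8B.djvu")
WORK_DIR = r"C:\start_automation\crawling_job"


def encoded_file_name(url):
    """Last path segment of the URL, still percent-encoded."""
    return urlsplit(url).path.rsplit("/", 1)[-1]


def temp_path_length(url, directory, temp_suffix=".3kii8ipd.tmp"):
    """Length of a temp path shaped like the one in the traceback."""
    path = PureWindowsPath(directory) / (encoded_file_name(url) + temp_suffix)
    return len(str(path))


def decoded_file_name(url):
    """File name with percent-escapes decoded back to the original characters."""
    return unquote(encoded_file_name(url))


if __name__ == "__main__":
    length = temp_path_length(URL, WORK_DIR)
    print("temp path length:", length)
    print("exceeds limit:", length > MAX_PATH - 1)
    print("decoded name length:", len(decoded_file_name(URL)))
```

For the failing URL, the percent-encoded temp path comes to 263 characters, just over the limit, while the decoded file name is only 41 characters long. That would be consistent with only the files with long non-ASCII names failing, but it is a hypothesis rather than a confirmed cause.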
Thanks for reading. I really appreciate any help.