refextract: stat: path should be string, bytes, os.PathLike or integer, not NoneType

I'm trying to use the refextract library to extract citations from the 'References' section of an academic paper in PDF format.

Here is my code:

from refextract import extract_references_from_file

def extract_citations(uploaded_file_location):
    try:
        citations = extract_references_from_file(uploaded_file_location)
        citations = merge_citations(citations)
        unique_citations = {}
        for citation in citations:
            raw_ref = citation['raw_ref'][0]
            if raw_ref not in unique_citations:
                unique_citations[raw_ref] = citation
            else:
                pass
        unique_citations_list = list(unique_citations.values())
        return unique_citations_list
    except Exception as e:
        print(f"An error occurred: {e}")
        return []

You can ignore the merge_citations function. The program runs successfully on my local machine (Ubuntu 22.04), but when I run it in Docker (I use Docker Compose to run everything), the Docker logs show this error:

TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType

Here is my Dockerfile:

FROM python:latest
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "main.py"]

The PDF comes from a Google Cloud Storage bucket. I have already debugged this and checked the file path, and it is correct, so the problem seems to come from the extract_references_from_file function itself.
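To narrow down which value is actually None inside the container, a small diagnostic along these lines might help. It is only a sketch: the helper names and the UPLOADED_FILE environment variable are hypothetical, and the idea that the library may look up an external executable such as pdftotext on PATH is an assumption, not something confirmed here. What is certain is that os.stat produces this exact kind of message when it is handed None.

```python
import os
import shutil


def describe_path(label, path):
    """Return a one-line report on a path value without raising."""
    if path is None:
        return f"{label}: None (os.stat would raise TypeError)"
    if not os.path.exists(path):
        return f"{label}: {path!r} does not exist in this environment"
    return f"{label}: {path!r} OK"


def stat_error_for_none():
    """Reproduce the message os.stat produces when it is given None."""
    try:
        os.stat(None)
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # Hypothetical: substitute the real upload path used by the application.
    uploaded_file_location = os.environ.get("UPLOADED_FILE")
    print(describe_path("uploaded file", uploaded_file_location))
    # Assumption: an external tool such as pdftotext may be needed at runtime.
    # shutil.which returns None when the executable is not on PATH.
    print(describe_path("pdftotext", shutil.which("pdftotext")))
    print(stat_error_for_none())
```

Running this both on the host and inside the container (for example with docker compose exec) and comparing the output should show whether the None comes from the file path or from something that exists on the host but is missing from the image.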

I've already searched, but I couldn't find a solution that works for me. I would appreciate any help fixing this.

