Split PDF files are being duplicated when splitting by text

I am trying to split 100s of PDFs with Python by a specific keyword. If a page in a PDF contains that keyword, that page should be split out into a new PDF. The problem I am facing is that the output files are being duplicated. I have tried everything I can think of to stop the duplication, but nothing changes.

import os
import fitz  # PyMuPDF


def split_pdf_by_text(pdf_path, keyword, output_folder):
    # Check if the provided path is a valid file
    if not os.path.isfile(pdf_path):
        print(f"Error: '{pdf_path}' is not a valid file.")
        return

    # Create output folder if it doesn't exist
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    print(f"Processing PDF: {pdf_path}")

    # Open the PDF file
    pdf_document = fitz.open(pdf_path)

    # Initialize a set to keep track of processed pages
    processed_pages = set()

    # Iterate through each page
    for page_number in range(len(pdf_document)):
        # Skip the page if it's already processed
        if page_number in processed_pages:
            continue

        # Get the page
        page = pdf_document.load_page(page_number)

        # Extract text from the page
        text = page.get_text()

        # Check if the keyword exists in the page text
        if keyword in text:
            # Construct the output file path
            output_file_name = f"{os.path.splitext(os.path.basename(pdf_path))[0]}_page_{page_number + 1}.pdf"
            output_path = os.path.join(output_folder, output_file_name)

            # Create a new PDF document
            new_pdf = fitz.open()
            new_pdf.insert_pdf(pdf_document, from_page=page_number, to_page=page_number)  # Insert the current page into the new PDF
            new_pdf.save(output_path)  # Save the new PDF
            print(f"Page {page_number + 1} saved to: {output_path}")

            # Close the new PDF
            new_pdf.close()

            # Add the page number to the set of processed pages
            processed_pages.add(page_number)

    # Close the original PDF
    pdf_document.close()


# Define the function to process all PDF files in a directory
def process_all_pdfs(input_folder, keyword, output_folder):
    # Iterate through each file in the input folder
    for root, _, files in os.walk(input_folder):
        for file in files:
            if file.endswith(".pdf"):
                # Get the full path of the PDF file
                pdf_path = os.path.join(root, file)

                # Process the PDF file
                split_pdf_by_text(pdf_path, keyword, output_folder)

If a page was processed, I tried to skip it and move along to the next page, but it's not working out for me.
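One thing worth checking, although the post does not show how process_all_pdfs is called: the loop visits each page index only once, so the processed_pages set cannot be what prevents or causes duplicates. If output_folder happens to sit inside input_folder (an assumption, not something the post states), os.walk would later descend into it and the already-split pages would be split again. A minimal sketch of a walk that skips the output folder follows; the helper name iter_source_pdfs is hypothetical and not part of the original script.

```python
import os


def iter_source_pdfs(input_folder, output_folder):
    """Yield PDF paths under input_folder, never descending into output_folder.

    Hypothetical helper: assumes output_folder is a separate directory,
    possibly nested somewhere below input_folder.
    """
    output_abs = os.path.abspath(output_folder)
    for root, dirs, files in os.walk(input_folder):
        # Prune the output folder in place so os.walk does not enter it.
        dirs[:] = [
            d for d in dirs
            if os.path.abspath(os.path.join(root, d)) != output_abs
        ]
        for name in sorted(files):
            # Case-insensitive match so ".PDF" files are picked up as well.
            if name.lower().endswith(".pdf"):
                yield os.path.join(root, name)
```

With this in place, process_all_pdfs could loop over iter_source_pdfs(input_folder, output_folder) and call split_pdf_by_text on each path. Clearing out files left over from earlier runs, or writing to a folder completely outside the input tree, would be the other things to rule out.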
