Channel: Active questions tagged python - Stack Overflow

Object tracking with yolov9 is producing an unopenable video file


I am trying to use the YOLOv9c object detection model: https://levelup.gitconnected.com/yolov9-faster-more-accurate-object-detection-with-revolutionary-techniques-1ef428d3950e

The purpose of my script is to identify the person who takes up the most space in any given frame and reframe the video with that person in the center. As far as I can tell, I've eliminated a few sources of error, including [1] no person being detected and [2] video encoding.
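The selection step described above can be sketched in isolation. This is a minimal sketch, not the author's method: it assumes the detections have already been converted to plain lists (for example via `.tolist()` on `results[0].boxes.xyxy`, `results[0].boxes.cls`, and `results[0].boxes.conf` in Ultralytics), and the function name `largest_person_box` and the sample detections are hypothetical. Note that each `xyxy` row holds only four coordinates, so the class id has to come from a separate list.

```python
def largest_person_box(xyxy, cls, conf, person_class=0, min_conf=0.0):
    """Return the (x1, y1, x2, y2) person box with the largest area, or None.

    xyxy: list of [x1, y1, x2, y2] rows
    cls:  list of class ids, one per row (0 is 'person' in COCO)
    conf: list of confidence scores, one per row
    """
    best_box, best_area = None, 0.0
    for (x1, y1, x2, y2), class_id, score in zip(xyxy, cls, conf):
        # Skip non-person detections and detections at or below the threshold.
        if int(class_id) != person_class or score <= min_conf:
            continue
        area = (x2 - x1) * (y2 - y1)
        if area > best_area:
            best_box, best_area = (x1, y1, x2, y2), area
    return best_box


# Hypothetical detections: two people and one object of another class.
# Class id 39 is only a stand-in for "not a person".
boxes = [[10, 20, 110, 220], [300, 50, 600, 650], [0, 0, 640, 384]]
classes = [0.0, 0.0, 39.0]
scores = [0.9, 0.8, 0.7]
print(largest_person_box(boxes, classes, scores))
```

The third box is the largest overall but is ignored because its class id is not the person class.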

[1]: The output shows that people are detected:

    0: 384x640 1 person, 1 bottle, 1 cup, 1 chair, 2 potted plants, 4 books, 478.6ms
    Speed: 2.7ms preprocess, 478.6ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)
    10

[2]: I've tried encoding the video several different ways, but none of them produces a playable file.
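One thing worth checking on the encoding side is the size of each frame written to the pipe. When ffmpeg reads `rawvideo` input, it consumes fixed-size frames of width × height × bytes per pixel, as declared by `-s` and `-pix_fmt`. The sketch below compares that figure with what a NumPy slice would actually return when the crop window extends past the frame edge. The 1920x1080 source size is an assumption (the question does not state it), and both helper names are hypothetical.

```python
def expected_frame_bytes(width, height, bytes_per_pixel=3):
    """Bytes ffmpeg reads per frame for packed 8-bit formats such as bgr24."""
    return width * height * bytes_per_pixel


def sliced_size(frame_w, frame_h, offset_x, offset_y, out_w, out_h):
    """Width and height actually returned by
    frame[offset_y:offset_y + out_h, offset_x:offset_x + out_w].
    NumPy slicing silently truncates at the array edge."""
    w = max(0, min(frame_w, offset_x + out_w) - offset_x)
    h = max(0, min(frame_h, offset_y + out_h) - offset_y)
    return w, h


# Assumed source size (not stated in the question): 1920x1080 landscape.
crop_w, crop_h = sliced_size(1920, 1080, 0, 0, 1080, 1920)
written = expected_frame_bytes(crop_w, crop_h)
needed = expected_frame_bytes(1080, 1920)
print(f"crop is {crop_w}x{crop_h}: {written} bytes written, {needed} expected")
```

Under that assumption the crop is shorter than the declared output height, so each write would supply fewer bytes than ffmpeg expects for one frame.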

Would love any insight or direction! Thanks in advance:

    import subprocess
    import numpy as np
    from ultralytics import YOLO
    import cv2

    model = YOLO('yolov9c.pt')
    video_path = 'test/test.mp4'
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        print("Error: Could not open video.")
        exit()

    # Output video resolution
    output_width = 1080
    output_height = 1920

    # Output video file
    output_video_path = 'output_video.mp4'

    # OpenCV properties for video
    fps = int(cap.get(cv2.CAP_PROP_FPS))
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # Command for ffmpeg to write the reframed video
    ffmpeg_cmd = [
        'ffmpeg',
        '-y',
        '-f', 'rawvideo',
        '-vcodec', 'rawvideo',
        '-s', f'{output_width}x{output_height}',
        '-pix_fmt', 'bgr24',
        '-r', str(fps),
        '-i', '-',
        '-c:v', 'libx264',
        '-pix_fmt', 'yuv420p',
        output_video_path
    ]

    # Open ffmpeg process
    ffmpeg_process = subprocess.Popen(ffmpeg_cmd, stdin=subprocess.PIPE)

    while True:
        # Capture frame-by-frame
        ret, frame = cap.read()
        if not ret:
            break
        print("10")  # Print "10" for each frame

        # Perform inference
        results = model(frame)

        # Extract bounding box coordinates and confidence scores
        xyxy = results[0].boxes.xyxy.tolist()
        conf = results[0].boxes.conf.tolist()

        # Find person bounding box with the largest area
        largest_area = 0
        largest_box = None
        for box, c in zip(xyxy, conf):
            if box[-1] == 0:  # Class index 0 represents person
                box_area = (box[2] - box[0]) * (box[3] - box[1])
                if box_area > largest_area and c[0] > 0:
                    largest_area = box_area
                    largest_box = box

        if largest_box is not None:
            print("Largest box found:", largest_box)  # Print the largest_box if it's not None

            # Calculate center of the largest person bounding box
            center_x = int((largest_box[0] + largest_box[2]) / 2)
            center_y = int((largest_box[1] + largest_box[3]) / 2)

            # Calculate offset to center the person in the output frame
            offset_x = max(0, int(center_x - output_width / 2))
            offset_y = max(0, int(center_y - output_height / 2))

            # Reframe the video with the person centered
            reframed_frame = frame[offset_y:offset_y + output_height, offset_x:offset_x + output_width]

            # Convert the frame to bytes and write to ffmpeg process stdin
            ffmpeg_process.stdin.write(reframed_frame.tobytes())

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    print("Finished writing")

    # Close ffmpeg process stdin
    ffmpeg_process.stdin.close()

    # Wait for ffmpeg process to finish
    ffmpeg_process.wait()

    cap.release()
    cv2.destroyAllWindows()
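One possible way to keep every written frame at exactly the declared output size is to crop a window that has the output aspect ratio and always fits inside the source frame, then scale that crop up to the output resolution. The sketch below covers only the window calculation. It is an illustration under assumptions, not a confirmed fix: the function name `centered_crop_window` is hypothetical, and the 1920x1080 source size and the person center are made-up values.

```python
def centered_crop_window(center_x, center_y, frame_w, frame_h, out_w, out_h):
    """Return (x0, y0, crop_w, crop_h): the largest window with the output
    aspect ratio that fits in the frame, centered on the given point where
    possible and shifted inward where it would cross a frame edge."""
    # Largest scale at which an out_w x out_h rectangle still fits the frame.
    scale = min(frame_w / out_w, frame_h / out_h)
    crop_w = max(1, int(out_w * scale))
    crop_h = max(1, int(out_h * scale))
    # Clamp the top-left corner so the window stays inside the frame.
    x0 = min(max(0, int(center_x - crop_w / 2)), frame_w - crop_w)
    y0 = min(max(0, int(center_y - crop_h / 2)), frame_h - crop_h)
    return x0, y0, crop_w, crop_h


# Assumed values: 1920x1080 source, 1080x1920 output, person centered at (450, 350).
x0, y0, crop_w, crop_h = centered_crop_window(450, 350, 1920, 1080, 1080, 1920)
print(x0, y0, crop_w, crop_h)
```

With the window in hand, the frame could be cropped as `frame[y0:y0 + crop_h, x0:x0 + crop_w]` and passed through `cv2.resize(crop, (output_width, output_height))` before writing, so that each write supplies `output_width * output_height * 3` bytes. Frames with no detected person would still need some fallback, such as reusing the previous window, which this sketch does not cover.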

Tl;dr

I'm using the YOLOv9c model for object tracking, and my script produces an output MP4 file. However, the file cannot be opened by any video player because of 'compatibility issues'.

