Webcam integration in Google Colab

YoungMau · Thursday, May 5, 2022, 14:22

Hi everyone,

I'm fairly new to this topic and would like to train a Keras LSTM on video from my webcam. In a Jupyter notebook this already worked quite well, but I'd like to improve performance with a GPU.
Since my MacBook unfortunately has no dedicated graphics card, I'd like to use the GPU support in Google Colab. In Google Colab I want to process a video stream from my webcam, or from an attached camera, so that I can train the LSTM with it. Unfortunately, integrating the webcam isn't that straightforward.

I've already followed a few tips from the internet, but I can't find a final solution and hope someone here can help me.

First, the stream from the webcam is converted using JavaScript.

Code:

from IPython.display import display, Javascript
from google.colab.output import eval_js

def video_stream():
  js = Javascript('''
    var video;
    var div = null;
    var stream;
    var captureCanvas;
    var imgElement;
    var labelElement;
    
    var pendingResolve = null;
    var shutdown = false;
    
    function removeDom() {
       stream.getVideoTracks()[0].stop();
       video.remove();
       div.remove();
       video = null;
       div = null;
       stream = null;
       imgElement = null;
       captureCanvas = null;
       labelElement = null;
    }
    
    function onAnimationFrame() {
      if (!shutdown) {
        window.requestAnimationFrame(onAnimationFrame);
      }
      if (pendingResolve) {
        var result = "";
        if (!shutdown) {
          captureCanvas.getContext('2d').drawImage(video, 0, 0, 640, 480);
          result = captureCanvas.toDataURL('image/jpeg', 0.8)
        }
        var lp = pendingResolve;
        pendingResolve = null;
        lp(result);
      }
    }
    
    async function createDom() {
      if (div !== null) {
        return stream;
      }

      div = document.createElement('div');
      div.style.border = '2px solid black';
      div.style.padding = '3px';
      div.style.width = '100%';
      div.style.maxWidth = '600px';
      document.body.appendChild(div);
      
      const modelOut = document.createElement('div');
      modelOut.innerHTML = "<span>Status:</span>";
      labelElement = document.createElement('span');
      labelElement.innerText = 'No data';
      labelElement.style.fontWeight = 'bold';
      modelOut.appendChild(labelElement);
      div.appendChild(modelOut);
           
      video = document.createElement('video');
      video.style.display = 'block';
      video.width = div.clientWidth - 6;
      video.setAttribute('playsinline', '');
      video.onclick = () => { shutdown = true; };
      stream = await navigator.mediaDevices.getUserMedia(
          {video: { facingMode: "environment"}});
      div.appendChild(video);

      imgElement = document.createElement('img');
      imgElement.style.position = 'absolute';
      imgElement.style.zIndex = 1;
      imgElement.onclick = () => { shutdown = true; };
      div.appendChild(imgElement);
      
      const instruction = document.createElement('div');
      instruction.innerHTML = 
          '<span style="color: red; font-weight: bold;">' +
          'When finished, click here or on the video to stop this demo</span>';
      div.appendChild(instruction);
      instruction.onclick = () => { shutdown = true; };
      
      video.srcObject = stream;
      await video.play();

      captureCanvas = document.createElement('canvas');
      captureCanvas.width = 640; //video.videoWidth;
      captureCanvas.height = 480; //video.videoHeight;
      window.requestAnimationFrame(onAnimationFrame);
      
      return stream;
    }
    async function stream_frame(label, imgData) {
      if (shutdown) {
        removeDom();
        shutdown = false;
        return '';
      }

      var preCreate = Date.now();
      stream = await createDom();
      
      var preShow = Date.now();
      if (label != "") {
        labelElement.innerHTML = label;
      }
            
      if (imgData != "") {
        var videoRect = video.getClientRects()[0];
        imgElement.style.top = videoRect.top + "px";
        imgElement.style.left = videoRect.left + "px";
        imgElement.style.width = videoRect.width + "px";
        imgElement.style.height = videoRect.height + "px";
        imgElement.src = imgData;
      }
      
      var preCapture = Date.now();
      var result = await new Promise(function(resolve, reject) {
        pendingResolve = resolve;
      });
      shutdown = false;
      
      return {'create': preShow - preCreate, 
              'show': preCapture - preShow, 
              'capture': Date.now() - preCapture,
              'img': result};
    }
    ''')

  display(js)
  
def video_frame(label, bbox):
  data = eval_js('stream_frame("{}", "{}")'.format(label, bbox))
  return data
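One detail of `video_frame` worth noting: the label and overlay are pasted into the JavaScript call between hand-written double quotes, so a label that itself contains a double quote would break the expression. A minimal sketch of one way around that, assuming the same `stream_frame(label, imgData)` signature as above; the helper name `build_stream_frame_call` is made up for illustration:

```python
import json


def build_stream_frame_call(label, bbox):
    """Build the JavaScript expression that would be handed to eval_js.

    json.dumps produces a double-quoted, escaped string literal, which is
    also a valid JavaScript string literal, so quotes and backslashes in
    the label can no longer terminate the argument early.
    """
    return "stream_frame({}, {})".format(json.dumps(label), json.dumps(bbox))


# Plain label: same expression the original format string produces
print(build_stream_frame_call("Capturing...", ""))
# Label with embedded quotes: escaped instead of breaking the call
print(build_stream_frame_call('say "hi"', ""))
```

In the notebook, the result would be passed on as `eval_js(build_stream_frame_call(label, bbox))`.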

This converts the JavaScript reply into an OpenCV image.

Code:

import io
from base64 import b64decode, b64encode

import cv2
import numpy as np
import PIL.Image

def js_to_image(js_reply):
  """
  Params:
          js_reply: JavaScript object containing image from webcam
  Returns:
          img: OpenCV BGR image
  """
  # decode base64 image
  image_bytes = b64decode(js_reply.split(',')[1])
  # convert bytes to numpy array
  jpg_as_np = np.frombuffer(image_bytes, dtype=np.uint8)
  # decode numpy array into OpenCV BGR image
  img = cv2.imdecode(jpg_as_np, flags=1)

  return img

# function to convert an OpenCV rectangle bounding box image into a base64 byte string to be overlaid on the video stream
def bbox_to_bytes(bbox_array):
  """
  Params:
          bbox_array: Numpy array (pixels) containing rectangle to overlay on video stream.
  Returns:
        bytes: Base64 image byte string
  """
  # convert array into PIL image
  bbox_PIL = PIL.Image.fromarray(bbox_array, 'RGBA')
  iobuf = io.BytesIO()
  # format bbox into png for return
  bbox_PIL.save(iobuf, format='png')
  # format return string
  bbox_bytes = 'data:image/png;base64,{}'.format((str(b64encode(iobuf.getvalue()), 'utf-8')))

  return bbox_bytes
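For reference, the string that `toDataURL` hands back is a data URL of the form `data:image/jpeg;base64,<payload>`, which is why `js_to_image` takes the part after the comma before decoding. The following standard-library sketch shows just that step in isolation; `data_url_to_bytes` is a hypothetical helper, and the "frame" is a few made-up bytes rather than a real JPEG:

```python
from base64 import b64decode, b64encode


def data_url_to_bytes(data_url):
    """Split a base64 data URL into (mime_type, raw_bytes)."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, b64decode(payload)


# Stand-in for a captured frame: the JPEG start-of-image marker plus filler
fake_jpeg = b"\xff\xd8" + b"payload"
url = "data:image/jpeg;base64," + b64encode(fake_jpeg).decode("ascii")

mime, raw = data_url_to_bytes(url)
print(mime, len(raw))
```

The bytes returned here are what `np.frombuffer` and `cv2.imdecode` then turn into a single image array.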

But when I run my code, I get the error "AttributeError: 'numpy.ndarray' object has no attribute 'isOpened'" at the line "while img.isOpened():". I don't understand this, because the conversion has already given me a cv2 image, so the .isOpened function should be callable.

Code:

# Interface to the webcam
video_stream()
label_html = 'Capturing...'
bbox = ''
js_reply = video_frame(label_html, bbox)
img = js_to_image(js_reply["img"])

# Set up the MediaPipe model
with mp_holistic.Holistic(min_detection_confidence=0.8, min_tracking_confidence=0.8) as holistic:
    while img.isOpened():

      # Read frame
      ret, frame = cap.read()

      # Detect landmarks
      image, results = mediapipe_detection(frame, holistic)
      print(results)

      # Draw landmarks
      draw_styled_landmarks(image, results)

      # Show on screen
      cv2.imshow('OpenCV Feed', image)

      # Quit
      if cv2.waitKey(10) & 0xFF == ord('q'):
        break
img.release()
cv2.destroyAllWindows()

YoungMau · Monday, May 9, 2022, 15:19

Does nobody have an idea what the cause might be?

__deets__ · Monday, May 9, 2022, 16:10

A numpy array has no isOpened. A capture object does. https://docs.opencv.org/3.4/d8/dfe/clas ... b328038585
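In practical terms, `js_to_image` returns one decoded frame, so there is nothing to open, read from, or release. The loop has to ask the browser for a new frame on every iteration instead. A rough sketch of that control flow follows, written against plain callables so it can be tried without a webcam. `run_frame_loop` and its parameters are made-up names, and treating an empty reply or an empty "img" string as "stream stopped" is an assumption based on the `stream_frame` code above:

```python
def run_frame_loop(get_reply, decode, process, max_frames=None):
    """Fetch, decode and process frames until the stream stops.

    get_reply: callable returning the reply of one capture call
               (in Colab, e.g. lambda: video_frame(label_html, bbox))
    decode:    callable turning the reply's "img" string into an image
               (in Colab, js_to_image)
    process:   callable applied to each decoded frame
    Returns the number of frames processed.
    """
    processed = 0
    while max_frames is None or processed < max_frames:
        reply = get_reply()
        # Assumption: an empty reply or empty "img" means the user clicked stop
        if not reply or not reply.get("img"):
            break
        process(decode(reply["img"]))
        processed += 1
    return processed


# Dry run with fake replies instead of a real webcam
fake_replies = iter([{"img": "frame-1"}, {"img": "frame-2"}, ""])
seen = []
count = run_frame_loop(lambda: next(fake_replies), str.upper, seen.append)
print(count, seen)
```

In the notebook, `process` would wrap the `mediapipe_detection` and `draw_styled_landmarks` calls. As far as I know, `cv2.imshow` and `cv2.waitKey` are not usable in Colab either; `cv2_imshow` from `google.colab.patches`, or the overlay path via `bbox_to_bytes`, is the usual substitute.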

ThomasL · Monday, May 9, 2022, 16:48

Maybe this will help you: https://colab.research.google.com/noteb ... iqYx97hPMi

or this: https://www.youtube.com/watch?v=YjWh7QvVH60