Support Safari 16.5 and iOS Chrome #58

Open
darioalessandro opened this issue Jun 11, 2023 · 21 comments
Labels
bug, good first issue, help wanted

Comments

@darioalessandro
Member

Safari seems to be missing some classes that our app relies on. We need to find a way around it, or at least show an error:

[Image: Screenshot 2023-06-11 at 6 02 42 PM]

@darioalessandro added the bug, good first issue, and help wanted labels Jun 11, 2023
@alcolmenar

alcolmenar commented Jul 21, 2023

@darioalessandro hey! thank you for releasing this project. I really think it's a great idea. I did spend some time looking at this issue and it essentially boils down to Safari not currently supporting the MediaStreamTrackGenerator/Processor Web APIs.

https://developer.mozilla.org/en-US/docs/Web/API/MediaStreamTrackGenerator#browser_compatibility

They only just added support for the WebCodecs API, which enables VideoEncoder, in version 16.4, so progress seems slow. I found some examples that used VideoStreamReader/Writer, but those have already been deprecated and aren't included in web_sys anymore.

It seems like we might be able to use a TransformStream, but my knowledge of Web APIs is pretty limited...
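A minimal sketch of how the app could detect the missing classes up front and show an error instead of failing later. The helper names `findMissingApis` and `unsupportedBrowserMessage` are hypothetical, and the list of required APIs is only an assumption based on the classes mentioned in this thread:

```javascript
// Names of the browser APIs discussed in this thread (assumed list, adjust as needed).
const REQUIRED_APIS = [
  "MediaStreamTrackProcessor",
  "MediaStreamTrackGenerator",
  "VideoEncoder",
  "AudioEncoder",
];

// Returns the names from `required` that are not defined on `scope`.
function findMissingApis(scope, required = REQUIRED_APIS) {
  return required.filter((name) => typeof scope[name] === "undefined");
}

// Hypothetical helper: builds a user-facing message, or null when nothing is missing.
function unsupportedBrowserMessage(scope) {
  const missing = findMissingApis(scope);
  return missing.length === 0
    ? null
    : "This browser is missing: " + missing.join(", ");
}
```

In a page this would be called as `unsupportedBrowserMessage(globalThis)` before any encoder is created. Passing the scope in as a parameter keeps the check testable outside a browser.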

@darioalessandro
Member Author

darioalessandro commented Jul 21, 2023

Thanks for looking into this issue!!

I think it is worth trying. @alcolmenar, would you like to take on this task?

@alcolmenar

alcolmenar commented Jul 21, 2023

> Thanks for looking into this issue!!
>
> I think it is worth trying. @alcolmenar, would you like to take on this task?

@darioalessandro Sure I can give it a shot. I might need some assistance though at some point if I get stuck.

@darioalessandro
Member Author

@alcolmenar how's it going?

@alcolmenar

@darioalessandro hey! I've been a bit busy with other things so I haven't had a chance to start on this yet. I should get working on it later today or tomorrow.

@darioalessandro
Member Author

sgtm

@alcolmenar

@darioalessandro I've been looking into adding Safari support and I think I've found a way to do this, but I'm not sure if it is something we want to implement. The main issue is that Safari doesn't currently support the MediaStreamTrackProcessor/Generator APIs, which make it possible to extract a ReadableStream from a MediaStreamTrack. One way I've seen this done is:

  1. Play MediaStream on a video element
  2. Draw video frame on to a canvas element
  3. Get the canvas ImageData and queue this to a ReadableStream

Here is an example of this done in JavaScript: https://github.com/GoogleChromeLabs/webm-wasm/blob/master/demo/live.html

We might be able to then use the same VideoEncoder created in camera_encoder.rs. I think one issue is that the canvas doesn't render unless it is in focus.

Also, I've been struggling a bit on how to implement the JsObject containing the methods necessary for the ReadableStream and I can't find any examples of this. I was thinking maybe we can pair on this if you're up for it?
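The three steps listed above, and the object with the methods a ReadableStream needs, can be sketched as follows. This is an illustration under stated assumptions, not the linked demo's code: `makeFrameSource` and `grabFrame` are hypothetical names, and `grabFrame` is assumed to return one captured frame per call, or `null` once capture has stopped.

```javascript
// Builds the "underlying source" object that the ReadableStream constructor expects.
// `grabFrame` is a hypothetical callback: in a browser it would draw the <video>
// element onto a canvas and return the canvas ImageData, or null once capture stops.
function makeFrameSource(grabFrame) {
  return {
    // Called by the stream whenever the consumer wants more data.
    pull(controller) {
      const frame = grabFrame();
      if (frame === null) {
        controller.close(); // no more frames: end the stream
      } else {
        controller.enqueue(frame);
      }
    },
  };
}

// Browser usage (assumes `video` is playing the MediaStream and `canvas` matches its size):
// const ctx = canvas.getContext("2d");
// const stream = new ReadableStream(makeFrameSource(() => {
//   ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
//   return ctx.getImageData(0, 0, canvas.width, canvas.height);
// }));
```

The sketch leaves pacing out: `pull` runs as fast as the consumer reads, so a real implementation would probably throttle capture to the video frame rate.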

@darioalessandro
Member Author

Yes, this will work 😄 you can read the offscreen image straight into a video_frame:

if let Err(e) = context.draw_image_with_html_video_element_and_dw_and_dh(
    &video_element,
    0.0,
    0.0,
    VIDEO_WIDTH.into(),
    VIDEO_HEIGHT.into(),
) {
    log!("error", e);
}

// create a JsDict with a timestamp property
let mut video_frame_init = VideoFrameInit::new();
video_frame_init.timestamp(0.0); // TODO: use an actual timestamp
video_frame_init.duration(1.0 / 30.0);

let video_frame = VideoFrame::new_with_html_image_element_and_video_frame_init(
    &html_image_element,
    &video_frame_init,
);
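For the timestamp TODO above: WebCodecs expresses a VideoFrame's `timestamp` and `duration` in microseconds, so one option is to derive both from a frame counter. A small sketch, assuming a fixed frame rate; `frameTiming` is a hypothetical helper, not part of the project:

```javascript
const MICROS_PER_SECOND = 1000000;

// Timestamp and duration (both in microseconds) for the n-th frame at a fixed frame rate.
// The timestamp is computed from the frame index directly so rounding errors do not accumulate.
function frameTiming(frameIndex, framesPerSecond) {
  return {
    timestamp: Math.round((frameIndex * MICROS_PER_SECOND) / framesPerSecond),
    duration: Math.round(MICROS_PER_SECOND / framesPerSecond),
  };
}
```

At 30 frames per second this gives a duration of 33333 microseconds per frame, and frame 30 lands exactly on the one-second mark.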
                        

you can modify this method directly:

let mut video_frame_counter = 0;
let poll_video = async {
    loop {
        if !enabled.load(Ordering::Acquire)
            || destroy.load(Ordering::Acquire)
            || switching.load(Ordering::Acquire)
        {
            video_track
                .clone()
                .unchecked_into::<MediaStreamTrack>()
                .stop();
            video_encoder.close();
            switching.store(false, Ordering::Release);
            return;
        }
        match JsFuture::from(video_reader.read()).await {
            Ok(js_frame) => {
                let video_frame = Reflect::get(&js_frame, &JsString::from("value"))
                    .unwrap()
                    .unchecked_into::<VideoFrame>();
                let mut opts = VideoEncoderEncodeOptions::new();
                video_frame_counter = (video_frame_counter + 1) % 50;
                opts.key_frame(video_frame_counter == 0);
                video_encoder.encode_with_options(&video_frame, &opts);
                video_frame.close();
            }
            Err(e) => {
                log!("error", e);
            }
        }
    }
};
poll_video.await;
log!("Killing video streamer");
});
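The key-frame cadence in the loop above can be isolated into a few lines. This sketch mirrors the same counter logic; `makeKeyFrameScheduler` is a hypothetical name used only for illustration:

```javascript
// Returns a function that reports whether the next frame should be
// requested as a key frame, using the same modulo counter as the loop above.
function makeKeyFrameScheduler(interval) {
  let counter = 0;
  return function nextIsKeyFrame() {
    counter = (counter + 1) % interval;
    return counter === 0;
  };
}
```

With an interval of 50, the first explicit key-frame request lands on the 50th frame and then repeats every 50 frames.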

You will need to enable the following features in the web-sys crate:

    "OffscreenCanvas",
    "OffscreenCanvasRenderingContext2d",
    "ImageData",
    "VideoFrameInit",

@alcolmenar

alcolmenar commented Jul 28, 2023

@darioalessandro thank you for the response, it definitely helped me out. I was able to get camera_encoding working on Safari and tested it locally, but I had to remove the use of the MediaStreamTrackGenerator for audio decoding. I put up a draft PR with the changes for you to check out. On the receive side, though, the video seemed to render at the full height and width, and I couldn't figure out how to adjust that without changing the drawImage height and width. I thought maybe it was just a CSS change, but that wasn't the case.

Also, while looking into the audio side of the equation, it looks like Safari doesn't support AudioEncoder either, so we'll need a third-party package to do this encoding. On the decode side, Safari doesn't support MediaStreamTrackGenerator or AudioDecoder, so we'll have to rethink how to implement this using the Web Audio APIs, specifically the stream generation and decoding (most likely a third-party package again). I tried to find some examples of doing the stream generation. All I could find are examples using ArrayBuffer, but with my limited JavaScript/WASM knowledge I'm not sure this will work. Do you have any suggestions for how to go about implementing this?

@darioalessandro
Member Author

This is awesome! We use opus for audio encoding so I am sure that we can port it to wasm! This is awesome🏆

@darioalessandro changed the title from "Support Safari 16.5" to "Support Safari 16.5 and iOS Chrome" Aug 2, 2023
@darioalessandro
Member Author

Seems like iOS Chrome has the same issue!!

[Image: IMG_7468691FB502-1]

@darioalessandro
Member Author

Adding ChatGPT response regarding audio processing, let me know what you think:

if I am receiving the audio chunks via a websocket, how can I then play it using the audio context?

ChatGPT
Certainly! You can play audio chunks received via a WebSocket using the AudioContext. Here's a general outline of how you could do it:

  1. Create the AudioContext and ScriptProcessorNode: You'll need to create an AudioContext and a ScriptProcessorNode (or AudioWorklet if available) to handle the incoming chunks.
  2. Set Up the WebSocket: You'll need to set up the WebSocket connection to receive the audio chunks. Make sure you know the audio format and sample rate.
  3. Decode and Play Audio Chunks: As chunks are received, you'll need to decode them (if necessary) and then add them to a buffer that the ScriptProcessorNode can play.

Here's some sample code that outlines this process:

var audioContext = new (window.AudioContext || window.webkitAudioContext)();
var scriptNode = audioContext.createScriptProcessor(4096, 1, 1); // Choose appropriate buffer size

var audioQueue = [];

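scriptNode.onaudioprocess = function(event) {
  var output = event.outputBuffer.getChannelData(0);
  var offset = 0; // write position within the output buffer
  // Copy whole chunks while they still fit in the remaining space.
  // (slice() returns a copy, so writes must target `output` itself at an offset.)
  while (audioQueue.length && output.length - offset >= audioQueue[0].length) {
    var chunk = audioQueue.shift();
    output.set(chunk, offset);
    offset += chunk.length;
  }
  // Fill the rest from the head of the next chunk and keep its tail queued.
  if (audioQueue.length > 0 && offset < output.length) {
    var remaining = output.length - offset;
    output.set(audioQueue[0].subarray(0, remaining), offset);
    audioQueue[0] = audioQueue[0].subarray(remaining);
  }
};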

scriptNode.connect(audioContext.destination);

var socket = new WebSocket('ws://your-websocket-server');
socket.binaryType = 'arraybuffer'; // deliver binary messages as ArrayBuffer instead of Blob

socket.onmessage = function(event) {
  var audioData = new Float32Array(event.data); // Assuming audio data is sent as Float32Array
  // You might need to decode the data if it's compressed
  audioQueue.push(audioData);
};

socket.onopen = function() {
  // Send any necessary commands to start receiving audio
};

Note that this example assumes the audio is being sent in a compatible format (e.g., Float32Array) and doesn't require additional decoding. If the audio is being sent in a compressed format, you'll need to decode it before adding it to the queue.

Additionally, precise synchronization and buffer handling might require careful tuning and consideration of the specific requirements of your application.

This is a basic and illustrative example. For a production environment, more refined and error-tolerant code would be necessary. Make sure to handle different WebSocket events properly and control the audio buffer efficiently to avoid glitches or interruptions in the playback.

@alcolmenar

alcolmenar commented Aug 2, 2023

> Seems like iOS Chrome has the same issue!!

Ah, that makes sense. From my understanding, iOS Chrome uses the same engine as Safari.

@darioalessandro
Member Author

Damn! So Chrome just uses the iOS WebView, that is bad :(

@alcolmenar

@darioalessandro it appears that using a processor or a worklet in the Audio graph is the way to go here outside of using WebRTC. In the worklet, we'll need to use a 3rd party opus encoder / decoder. I was thinking about using this: https://github.com/mmig/opus-encdec

There don't appear to be any Rust implementations of Opus.

Also, this'll probably take me a bit of time to figure out. I'm pretty far out of my wheelhouse here.
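For the worklet approach mentioned above, one piece that would likely be needed regardless of which Opus decoder is chosen is a buffer between the network side, which delivers decoded chunks of arbitrary size, and the audio callback, which asks for fixed-size blocks. A minimal sketch of such a buffer follows. `PcmRingBuffer` is a hypothetical class, it handles a single channel only, and it assumes the decoder outputs Float32 samples:

```javascript
// Fixed-capacity FIFO of PCM samples (single channel).
class PcmRingBuffer {
  constructor(capacity) {
    this.data = new Float32Array(capacity);
    this.readIndex = 0;
    this.length = 0; // number of samples currently stored
  }

  // Appends samples; returns how many were dropped because the buffer was full.
  push(samples) {
    const free = this.data.length - this.length;
    const count = Math.min(free, samples.length);
    let writeIndex = (this.readIndex + this.length) % this.data.length;
    for (let i = 0; i < count; i++) {
      this.data[writeIndex] = samples[i];
      writeIndex = (writeIndex + 1) % this.data.length;
    }
    this.length += count;
    return samples.length - count;
  }

  // Fills `output` completely, padding with silence (zeros) on underrun.
  // Returns the number of real samples written.
  pull(output) {
    const count = Math.min(this.length, output.length);
    for (let i = 0; i < count; i++) {
      output[i] = this.data[this.readIndex];
      this.readIndex = (this.readIndex + 1) % this.data.length;
    }
    output.fill(0, count);
    this.length -= count;
    return count;
  }
}
```

Inside an AudioWorkletProcessor, `process()` could call `pull(outputs[0][0])` on each 128-sample render quantum, while decoded chunks arriving through the processor's message port are passed to `push()`. Padding with zeros on underrun trades a brief silence for never blocking the audio thread.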

@darioalessandro
Member Author

This helps a lot! Do you think we should fall back to WebRTC for old browsers? We could terminate the WebRTC connections on the backend and transform the streams to the other protobuf protocol.

@alcolmenar

I think there'd be better support for it since many of the video conferencing platforms use it. However, I do think that if we use WebRTC for part of the app, it'd make sense to use it for all media transfers. But that'd be a large change and possibly not in the spirit of this app. There is the idea of an SFU (Selective Forwarding Unit) in the WebRTC world, which is similar to what this app is doing, except that this app uses WebSockets and protobufs.

It can work using the solution above though and imo probably easier?

@darioalessandro
Member Author

> I think there'd be better support for it since many of the video conferencing platforms use it. However, I do think that if we use WebRTC for part of the app, it'd make sense to use it for all media transfers. But that'd be a large change and possibly not in the spirit of this app. There is the idea of an SFU (Selective Forwarding Unit) in the WebRTC world, which is similar to what this app is doing, except that this app uses WebSockets and protobufs.
>
> It can work using the solution above though and imo probably easier?

Go with the approach you described earlier:

> using a processor or a worklet in the Audio graph is the way to go here outside of using WebRTC. In the worklet, we'll need to use a 3rd party opus encoder / decoder. I was thinking about using this: https://github.com/mmig/opus-encdec

@darioalessandro
Member Author

@alcolmenar are you pursuing this?

@alcolmenar

hey ya still working on it. I should have something for the encode side soon

@allsey87

According to MDN and caniuse.com, MediaStreamTrackProcessor should be available in Safari 18 which was released a couple weeks ago. However, I am still getting the same error when I try to call the MediaStreamTrackProcessor constructor.


3 participants