Crystal Speech

The Crystal Speech Operator is an advanced AI-powered denoiser that enhances live audio streams by eliminating unwanted background noise in real time. Whether you're in a noisy environment or dealing with microphone interference, this feature ensures that only clear, high-quality speech comes through. By isolating voices and filtering out distractions, Crystal Speech delivers crisp, uninterrupted audio for a seamless listening experience.

Note!

Crystal Speech introduces a few frames of latency.
For optimal synchronization between audio and video, consider using the Video Delay Operator in conjunction with Crystal Speech. This ensures that any audio processing delay introduced by Crystal Speech is matched by a corresponding video delay, keeping your audio and video in sync during live streams or recordings.

Audio Output and Mono Processing

To achieve the best noise reduction in real time, Crystal Speech processes audio in mono. By working with a single channel, it ensures clearer, more consistent speech while effectively eliminating unwanted background noise. This approach is standard in many high-performance denoisers, enabling efficient and precise noise filtering.
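As an illustration of what single-channel processing implies, the sketch below downmixes multichannel frames to mono by averaging the channels. This is only an assumption for illustration: the documentation does not state how Crystal Speech derives its mono signal, and the function name `downmix_to_mono` is hypothetical.

```python
def downmix_to_mono(frames):
    """Average the channel samples of each frame into one mono sample.

    `frames` is a sequence of tuples, one tuple per audio frame, with one
    float sample per channel. Equal-weight averaging is an assumed downmix,
    not necessarily the method Crystal Speech uses.
    """
    return [sum(frame) / len(frame) for frame in frames]


# Three stereo frames (left, right) with samples in the range -1.0 to 1.0.
stereo = [(0.5, 0.25), (-1.0, 1.0), (0.2, 0.2)]
mono = downmix_to_mono(stereo)
```

Note that content which is out of phase between the channels, as in the second frame, cancels out when averaged.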

Note

This operator displays the audio input level before processing and the output level after the operator has been applied, allowing you to monitor how the adjustments affect the signal.

A miniature audio meter (VU meter) in the header indicates incoming audio, so you can quickly verify that the operator is receiving audio even when it is collapsed.

Tip

Use Settings to adjust how long the signal overload indicator stays active, and Project Options to change the maximum peak level displayed in all audio meters to your preference.

State

The State property shows the current status of Crystal Speech: active, idle, or in an error state. This lets users see the feature’s operational state at a glance.

Action

The Action section lets users control when Crystal Speech starts and stops.

Auto-Start: When enabled, Crystal Speech will automatically begin processing audio when the application starts.
Start: Manually activates Crystal Speech to begin denoising audio.
Stop: Turns off Crystal Speech, ending audio processing.

Dry/Wet Level

The Dry/Wet Level section gives users fine control over how much of the original audio is preserved versus how much is processed by Crystal Speech, allowing for a customizable balance between clarity and authenticity.

Level (%):
Adjusts the blend between the original (dry) and processed (wet) audio. A lower percentage keeps more of the raw sound, while a higher percentage applies more noise reduction for a cleaner output. This allows users to fine-tune the effect to match their environment and preferences.
Bypass:
Instantly routes the original audio through without applying noise reduction, allowing the user to compare the raw and processed sound in real time without stopping Crystal Speech.
Reset:
Restores the settings in this section to their default values.
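The Level and Bypass controls above can be pictured as a simple crossfade between two signals. The sketch below assumes a linear blend, where Level (%) is the share of processed audio in the output. The actual blending curve used by Crystal Speech is not documented here, and the function name `blend_dry_wet` is hypothetical.

```python
def blend_dry_wet(dry, wet, level_percent, bypass=False):
    """Linearly mix dry (original) and wet (processed) samples.

    level_percent = 0 returns only the dry signal, 100 only the wet signal.
    With bypass=True the dry signal is passed through unchanged, whatever
    the level. A linear crossfade is an assumption made for illustration.
    """
    if bypass:
        return list(dry)
    if not 0 <= level_percent <= 100:
        raise ValueError("level_percent must be between 0 and 100")
    wet_gain = level_percent / 100.0
    dry_gain = 1.0 - wet_gain
    return [dry_gain * d + wet_gain * w for d, w in zip(dry, wet)]


dry = [1.0, 0.5]   # original samples, including background noise
wet = [0.0, 0.0]   # processed samples (here the denoiser removed everything)
half = blend_dry_wet(dry, wet, 50)
```

In this model, Bypass gives the same output as a level of 0%, but leaves the Level setting untouched so the comparison can be switched on and off quickly.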

Latency

The Latency section reports, in real time, the delay that Crystal Speech adds during audio processing. Noise reduction has to analyze the incoming signal before it can deliver the cleaned audio, so a small amount of processing time is unavoidable. The delay is small, but it can vary depending on system performance and processing complexity. The measurements below show how much delay is currently being introduced, so you can keep audio and video in sync.

Use the Video Delay operator to compensate for audio latency introduced by Crystal Speech.

Audio Latency (Frames):
Displays how many video frames the audio has been delayed due to Crystal Speech’s processing. This helps users understand any potential audio-visual desynchronization in live streams.
Audio Latency (ms):
Shows the delay in milliseconds caused by noise reduction. This allows users to monitor and optimize their setup to maintain smooth and natural-sounding audio playback.
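The two latency readouts are related through the video frame rate. The sketch below shows the conversion and a possible way to pick a whole-frame video delay. It assumes the frame count refers to the project's video frame rate and that rounding to the nearest frame is acceptable. Neither assumption is confirmed by this page, and all function names are hypothetical.

```python
def latency_ms_to_frames(latency_ms, fps):
    """Convert an audio latency in milliseconds to (fractional) video frames."""
    return latency_ms * fps / 1000.0


def latency_frames_to_ms(frames, fps):
    """Convert a latency in video frames to milliseconds."""
    return frames * 1000.0 / fps


def video_delay_frames(latency_ms, fps):
    """Whole number of video frames that best matches the audio latency.

    Rounds to the nearest frame (assumed policy). Python's round() sends
    exact .5 values to the nearest even integer.
    """
    return round(latency_ms_to_frames(latency_ms, fps))


# Example: 80 ms of audio latency in a 25 fps project is exactly 2 frames.
frames_at_25 = latency_ms_to_frames(80, 25)
```

The resulting frame count is the kind of value you would enter in the Video Delay operator to bring the picture back in line with the processed audio.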
