Audible clicks when changing pitch-shift factor

(From https://github.com/breakfastquay/rubberband/issues/30 reported by Daniele Ghisi; also reported by a correspondent earlier in the year)

Audible clicks are sometimes heard when the pitch-shift factor is changed.

This can only happen in real-time mode, as the pitch factor is fixed in offline mode.

Currently investigating this in the pitch-review branch. Although pitch shifting to fixed ratios is quite well-tested, there have historically been no direct quality or regression tests for live pitch changes, which is a major omission.

The situation here depends on the resampler in use, but none is without problems:

  • Speex does not smooth filter updates, so there is almost always a second-order discontinuity at the boundary where the rate changes (often manifesting as a repeated sample in the output).

  • Libsamplerate is designed to smooth filter updates, so that it can be used for variable rate changes without audible artifacts. However, it doesn't always seem to get it quite right. Sometimes a rate change will yield a discontinuity in the middle of the subsequent processing block, suggesting something may be wrong with the smoothing itself. Although these artifacts are less common than those produced by the Speex wrapper, they are usually louder.

  • Libsamplerate can also be told not to smooth its filter changes (by setting the new ratio explicitly rather than updating it within the process structure), making it behave more like Speex.

  • The IPP wrapper produces awful artifacts on rate changes, well outside the scope of this report!

With libsamplerate there is a rather blunt tactic available (tested in pitch-review, see PERFORM_LIBSAMPLERATE_XFADE define) which is to duplicate the resampler on each rate change and feed a short prefix of the input to the "spare" resampler as well as to the main one. The main resampler is configured not to smooth its filter changes, so avoiding the mid-block discontinuities; the "spare" one is configured to do so; and we cross-fade the output from the spare one back to the main one at the start of the output block. This does basically work, but it depends on a new API symbol (src_clone) added to libsamplerate since the current official release 0.1.9, and it introduces further allocations in the main process flow.

Assigned to
8 months ago
8 months ago
No labels applied.

~breakfastquay 8 months ago

Probably best thing to do next is dig down in libsamplerate, see if I can find a small misbehaving case, and fix or seek help upstream there.

Now that libsamplerate has a BSD-style licence, it seems reasonable to suggest that anyone wanting dynamic pitch changing should use the libsamplerate resampler rather than any of the alternatives.

~cannam 8 months ago*

(same author but switching to personal login)

Some investigation later, I found two separate problems that both produce similar-sounding artifacts (when using libsamplerate - as mentioned above, there are other problems with the other libraries and that's not changing just at the moment):

  • In Resampler the output frame space provided needed to be (at least) one sample larger, or else the build-up of accumulated fractional samples would eventually result in a lost sample. Fixed in commit:dbfc749bce16 in the default branch.

  • In StretcherProcess::writeChunk, when zeroing the tail of the accumulator and window accumulator after shifting, the buffer was zeroed starting from the current window length - but it was possible that it still had valid values after that point in the case where the window length had been reduced recently because of a change in ratio. This code needed to use the accumulator-fill length instead of the current window length. Fixed in commit:6c0f84bf30b3 in the default branch.

I'm still reviewing these and looking at some regression tests but so far these appear to amount to a complete fix for the issue. Any feedback based on testing the current default branch (with libsamplerate) is welcome.