Added a check (in commit:f9a50353f998) for inputIncrement < 1, increasing the input and output increments and window size accordingly.
We should still check & at least document a maximum stretch factor anyway, as the windowing doesn't work correctly for such extreme values - a factor of 640 works well but 6400 still does not, this fix notwithstanding.
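As a rough illustration of the kind of guard described above (this is not the code from the commit), the sketch below computes a fractional input increment and doubles both increments and the window size until the input increment reaches at least 1. The struct and function names, and the choice of doubling as the scaling step, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical container for the three sizes discussed above.
struct SizesSketch {
    std::size_t inputIncrement;
    std::size_t outputIncrement;
    std::size_t windowSize;
};

// Illustrative only: scale everything up by powers of two until the
// input increment is no longer below 1. The doubling step is an assumption.
SizesSketch guardInputIncrement(std::size_t windowSize,
                                std::size_t outputIncrement,
                                double timeRatio)
{
    assert(timeRatio > 0.0 && std::isfinite(timeRatio));
    assert(outputIncrement > 0);

    // Keep the increment fractional so that values below 1 are visible
    // instead of silently truncating to 0.
    double in = double(outputIncrement) / timeRatio;
    while (in < 1.0) {
        in *= 2.0;
        outputIncrement *= 2;
        windowSize *= 2;
    }
    return { std::size_t(std::floor(in)), outputIncrement, windowSize };
}
```

With an assumed starting point of a 2048-sample window and an output increment of 256, a ratio of 6400 needs five doublings, giving an input increment of 1, an output increment of 8192 and a 65536-sample window. The ratio actually achieved is then 8192 rather than 6400, so rounding becomes significant at these extremes.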
Probably the best thing to do next is to dig into libsamplerate, see if I can find a small misbehaving case, and fix it or seek help upstream.
Now that libsamplerate has a BSD-style licence, it seems reasonable to suggest that anyone wanting dynamic pitch changing should use the libsamplerate resampler rather than any of the alternatives.
(From https://github.com/breakfastquay/rubberband/issues/30 reported by Daniele Ghisi; also reported by a correspondent earlier in the year)
Audible clicks are sometimes heard when the pitch-shift factor is changed.
This can only happen in real-time mode, as the pitch factor is fixed in offline mode.
Currently investigating this in the pitch-review branch. Although pitch shifting to fixed ratios is quite well-tested, there have historically been no direct quality or regression tests for live pitch changes, which is a major omission.
The situation here depends on the resampler in use, but none is without problems:
Speex does not smooth filter updates, so there is almost always a second-order discontinuity at the boundary where the rate changes (often manifesting as a repeated sample in the output).
Libsamplerate is designed to smooth filter updates, so that it can be used for variable rate changes without audible artifacts. However, it doesn't always seem to get it quite right. Sometimes a rate change will yield a discontinuity in the middle of the subsequent processing block, suggesting something may be wrong with the smoothing itself. Although these artifacts are less common than those produced by the Speex wrapper, they are usually louder.
Libsamplerate can also be told not to smooth its filter changes (by setting the new ratio explicitly rather than updating it within the process structure), making it behave more like Speex.
The IPP wrapper produces awful artifacts on rate changes, well outside the scope of this report!
With libsamplerate there is a rather blunt tactic available (tested in PERFORM_LIBSAMPLERATE_XFADE define) which is to duplicate the resampler on each rate change and feed a short prefix of the input to the "spare" resampler as well as to the main one. The main resampler is configured not to smooth its filter changes, so avoiding the mid-block discontinuities; the "spare" one is configured to do so; and we cross-fade the output from the spare one back to the main one at the start of the output block. This does basically work, but it depends on a new API symbol (src_clone) added to libsamplerate since the current official release 0.1.9, and it introduces further allocations in the main process flow.
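The cross-fade step on its own can be sketched as follows. This is a minimal illustration that assumes a linear fade and operates on two already-resampled buffers; it does not call libsamplerate, the function and parameter names are hypothetical, and the vector it allocates would not be acceptable in a real-time path.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear cross-fade from the "spare" resampler's output to the primary
// one over the first fadeLen samples; after that the output is simply
// the primary signal. Only the first fadeLen samples of `spare` are read.
std::vector<float> crossfadeToPrimary(const std::vector<float> &spare,
                                      const std::vector<float> &primary,
                                      std::size_t fadeLen)
{
    assert(fadeLen <= spare.size() && fadeLen <= primary.size());

    std::vector<float> out(primary);
    for (std::size_t i = 0; i < fadeLen; ++i) {
        // Gain of the primary signal: 0 at the block start, rising towards 1.
        float g = float(i) / float(fadeLen);
        out[i] = spare[i] * (1.0f - g) + primary[i] * g;
    }
    return out;
}
```

The first output sample is taken entirely from the spare (smoothed) resampler, so the block starts continuous with what came before, and the contribution of the unsmoothed primary resampler rises to full by sample fadeLen.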
Fixed again in commit:7c6da77444c7 with a separate check for
Reopening; this breaks the iOS build (make -f Makefile.ios).
MAC_OS_X_VERSION_MIN_REQUIRED is defined when building for iOS, with a value of 1050. But vvfabf is not available on iOS (never has been, I think) and we should be using vvfabsf there. A test that also eliminates iOS is required!
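One way such a test could look is sketched below, using TARGET_OS_IPHONE from Apple's TargetConditionals.h to rule out iOS before consulting MAC_OS_X_VERSION_MIN_REQUIRED. The SKETCH_ macro names and the threshold value are hypothetical, and this is not necessarily how the actual fix is written; the sketch only records which routine would be selected rather than calling it.

```cpp
#if defined(__APPLE__)
#include <TargetConditionals.h>   // defines TARGET_OS_IPHONE as 0 or 1
#include <AvailabilityMacros.h>   // defines MAC_OS_X_VERSION_MIN_REQUIRED
#endif

// Hypothetical threshold, for illustration only: deployment targets at or
// below this value are treated as wanting the older vvfabf name.
#define SKETCH_LEGACY_THRESHOLD 1050

#if defined(__APPLE__) && TARGET_OS_IPHONE
// iOS: never select vvfabf, whatever MAC_OS_X_VERSION_MIN_REQUIRED says.
#define SKETCH_ABS_ROUTINE "vvfabsf"
#elif defined(__APPLE__) && defined(MAC_OS_X_VERSION_MIN_REQUIRED) && \
      (MAC_OS_X_VERSION_MIN_REQUIRED <= SKETCH_LEGACY_THRESHOLD)
#define SKETCH_ABS_ROUTINE "vvfabf"
#elif defined(__APPLE__)
#define SKETCH_ABS_ROUTINE "vvfabsf"
#else
// Non-Apple platforms: no vForce routine at all.
#define SKETCH_ABS_ROUTINE "portable loop"
#endif

// Reports which routine the conditions above would pick on this platform.
const char *selectedAbsRoutine()
{
    return SKETCH_ABS_ROUTINE;
}
```

Checking the iOS case first means the version macro is only consulted on macOS proper, so a value of 1050 leaking into an iOS build can no longer select vvfabf.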
Fixed in commit:af6005ff153f
Fixed in commit:0e5a0e58afd6
(Formerly Bitbucket issue 4, filed by RJ Skerry-Ryan)
Using librubberband 1.8.1 and current tip in realtime mode with all default settings (sample rate 44100) I can reliably cause a zero-division SIGFPE by setting the time ratio appropriately high.
An example time ratio is 6400. The input increment becomes 0. Since lots of places in the code divide by the input increment without checking for zero, this causes division-by-zero SIGFPEs.
In RubberBandStretcher::Impl::calculateSizes():464, the input increment is assigned to be the output increment divided by the effective time ratio. For default settings, 44.1kHz and a pitch ratio of 1.0, inputIncrement becomes 0 for a time ratio of 256 and higher. Using a larger FFT window size doubles this to 512 (since the FFT window size is in the numerator of the increment calculation).
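The arithmetic behind the failure can be reproduced in isolation. The sketch below assumes the input increment is obtained by truncating the quotient of the output increment and the effective ratio to an integer, as described above; the function names and the output increment of 256 used in the note that follows are illustrative rather than taken from the library.

```cpp
#include <cassert>
#include <cstddef>

// Assumed form of the calculation: integer truncation of a real quotient.
std::size_t truncatedInputIncrement(std::size_t outputIncrement,
                                    double effectiveRatio)
{
    assert(effectiveRatio > 0.0);
    return std::size_t(double(outputIncrement) / effectiveRatio);
}

// Hypothetical caller-side check: true if a later integer division by
// the increment would be safe.
bool incrementIsUsable(std::size_t outputIncrement, double effectiveRatio)
{
    return truncatedInputIncrement(outputIncrement, effectiveRatio) >= 1;
}
```

For an assumed output increment of 256, a ratio of 100 gives an increment of 2, while a ratio of 6400 truncates to 0, at which point any integer division by the increment faults.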
I don't see any mention in the documentation of a maximum value for the time ratio or a way to check what the maximum value is given the window size. Is there a way to detect what the max time ratio should be? I could hack some heuristics in but since the maximum value is affected by the pitch ratio and the FFT window size I would end up having to build assumptions about RB's implementation into my code. I can also check for 0 inputIncrement and do a scan upwards to find the lowest value but this is also something I'd rather not do in an audio callback :).
For more info, here's the bug in our project: https://bugs.launchpad.net/ubuntu/+bug/1263233
(Formerly Bitbucket issue 5, reported by Michael Heuer)
LADSPA_LDFLAGS in the link line for the LADSPA plugin (not built by default) but the symbol is not actually defined anywhere. As a result the LADSPA plugin is built as an executable, which fails because it isn't intended to be one and has no