Currently VGZ files are decompressed in full before playback starts. This can take a while, especially on Z80. However, decompression still takes less time than playing the file back, and during playback the CPU is idle much of the time.
If the files were decompressed during the playback idle time, using a relatively small buffer (16K ought to be enough, I think), playback could start a lot sooner.
This would mean that VGMPlay needs to move to a pipelined processing method, which would probably involve something like coroutines.
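To illustrate the pipelined idea, here is a rough sketch in Python rather than Z80 assembly, with a generator standing in for the coroutine. It is not how VGMPlay works today. The names `decompress_chunks` and `read_chunk`, the 512-byte read size, and the use of the 16K figure as a per-step output limit are all assumptions made for the example.

```python
import zlib

BUFFER_SIZE = 16 * 1024  # assumed, taken from the "16K" estimate above


def decompress_chunks(read_chunk, chunk_size=512, buffer_size=BUFFER_SIZE):
    """Generator acting as the decompression coroutine.

    read_chunk(n) is a hypothetical callback returning up to n bytes of
    the compressed VGZ file, or b"" at end of file. Each resumption does
    a limited amount of inflate work and then hands control back.
    """
    inflater = zlib.decompressobj(wbits=31)  # 31 = expect a gzip wrapper
    while not inflater.eof:
        compressed = read_chunk(chunk_size)
        if not compressed:
            break  # input ended early (truncated file)
        while compressed:
            # Produce at most buffer_size bytes; input that was not
            # consumed yet is kept in unconsumed_tail for the next call.
            out = inflater.decompress(compressed, buffer_size)
            compressed = inflater.unconsumed_tail
            if out:
                yield out
    tail = inflater.flush()  # any output still held inside the inflater
    if tail:
        yield tail
```

The player side would then pull the next piece with `next()` whenever it has idle time and buffer space, instead of inflating the whole file up front. On the real target the same structure would need a hand-written coroutine switch, which is the part this sketch glosses over.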
Note that GD3 tags are unfortunately located at the end of the file, so they would only become available well after playback has started. Additionally, DOS text I/O is slow and would disturb playback. This may be easier to deal with once VGMPlay has a UI that accesses the VDP directly (#12).
Additionally, some VGMs use CPU-controlled PCM playback, which leaves no time to decompress on the fly. This case can be detected by the presence of certain data blocks, and the player then needs to fall back on loading the entire file into memory.
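A possible shape for that check, again sketched in Python. It relies on the VGM format's data block command (`0x67 0x66 tt ss ss ss ss`) and on the header fields at `0x08` (version) and `0x34` (relative data offset). The function names are hypothetical, the sketch only looks at the first command of the stream, and which block types should trigger the fallback is left as a parameter because that still has to be decided.

```python
import struct


def first_data_block_type(vgm):
    """Return the type byte of a data block found at the very start of
    the VGM command stream, or None if the stream starts otherwise.

    vgm is a bytes object holding at least the decompressed beginning
    of the file (header plus the first few command bytes).
    """
    if vgm[0:4] != b"Vgm ":
        raise ValueError("not a VGM file")
    (version,) = struct.unpack_from("<I", vgm, 0x08)
    data_start = 0x40  # fixed start for versions before 1.50
    if version >= 0x150:
        (rel,) = struct.unpack_from("<I", vgm, 0x34)
        if rel:
            data_start = 0x34 + rel  # offset is relative to 0x34
    head = vgm[data_start:data_start + 3]
    if len(head) == 3 and head[0] == 0x67 and head[1] == 0x66:
        return head[2]
    return None


def needs_full_load(vgm, fallback_types):
    """True if the first data block is one of the caller-supplied types
    (an assumed policy; the actual list of types is not decided here)."""
    return first_data_block_type(vgm) in fallback_types
```

Because data blocks normally sit at the start of the command stream, this decision could be made after decompressing only the first part of the file, before committing to either streaming or a full load.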