Given that this is a technical area of the forum...
First off spoken word recordings can be far smaller than music recordings,
Yes, spoken audio is more forgiving than music. You will notice the drop in quality, and you won't want to use a heavily reduced file if it has to sound good, but the speech will still be understandable. However...
there is a far smaller range and depth of frequencies.
This is not true. Voice also occupies high-end frequencies (consonants, sibilance, and the harmonics above each note), and high-end frequencies are what suffer first when you use lower-quality file formats. There is a narrower range for the fundamental frequencies, but there's much more to a recording than just fundamental frequencies.
That said, you can lose the higher frequencies and the audio will still be understandable -- it will also sound worse.
Also related is the sample rate. Significant savings can be made by reducing the number of samples per second. I'm not saying that PS should go for the cheapest quality audio but that it does not need to be over produced.
When you say "over produced", do you mean "high fidelity"?
Sample rate and bit resolution are what you would play around with to make a file smaller. Voice audio at 8-bit resolution and a 6 kHz sampling rate is still understandable. (CD audio has 16-bit resolution and a 44.1 kHz sampling rate.) Whether or not it's "acceptable" though depends on the application.
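To put rough numbers on that, here is a quick back-of-the-envelope sketch in Python. It assumes uncompressed PCM, a mono voice recording, and stereo for the CD figure. The function names are just mine for illustration, and real files will differ once you add headers or compression.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """Data rate of uncompressed PCM audio, in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


def highest_frequency_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


# Assumption: the low-quality voice recording is mono.
voice_rate = pcm_bytes_per_second(6_000, 8, channels=1)
# CD audio: 16-bit, 44.1 kHz, stereo.
cd_rate = pcm_bytes_per_second(44_100, 16, channels=2)

print("voice bytes/sec:", voice_rate)
print("CD bytes/sec:", cd_rate)
print("CD is larger by a factor of:", cd_rate / voice_rate)
print("voice top frequency (Hz):", highest_frequency_hz(6_000))
print("CD top frequency (Hz):", highest_frequency_hz(44_100))
```

The second function is the catch: at a 6 kHz sampling rate nothing above 3 kHz survives, which is why the result is understandable but sounds noticeably worse.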