Media-Independent Presentation Language
MIPL Implementation and Deployment Notes
Much like HTML, a MIPL script consists of Free Text and Tags. Tags are enclosed between a less-than sign "<" and a greater-than sign ">", as is done in HTML.
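As a rough illustration of the Free Text / Tag distinction, the following Python sketch splits a script into the two kinds of pieces. It assumes only what is stated above (tags sit between "<" and ">"); the function name tokenize and the sample script are hypothetical, and no real MIPL browser is claimed to work this way.

```python
import re

# Anything between "<" and ">" is treated as a tag; everything else is Free Text.
# This is an illustrative assumption, not a full MIPL grammar.
TAG_PATTERN = re.compile(r"<([^<>]*)>")


def tokenize(script):
    """Split a script into ("text", ...) and ("tag", ...) pieces, in order."""
    tokens = []
    pos = 0
    for match in TAG_PATTERN.finditer(script):
        if match.start() > pos:
            # Free Text found between the previous tag and this one.
            tokens.append(("text", script[pos:match.start()]))
        tokens.append(("tag", match.group(1).strip()))
        pos = match.end()
    if pos < len(script):
        # Trailing Free Text after the last tag.
        tokens.append(("text", script[pos:]))
    return tokens


if __name__ == "__main__":
    for kind, value in tokenize("Welcome to the demo. <EXIT>"):
        print(kind, repr(value))
```

A real browser would go on to interpret each tag; this sketch stops at separating the two categories.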
Unlike HTML, MIPL script has the concept of a "session". When a MIPL script reaches its end with nothing more to do, or encounters an <EXIT> tag, the browser (or at least its session) exits and the user is disconnected, if appropriate.
MIPL browsers do not have, and are not to have, "back" or "forward" buttons or any other kind of navigation buttons. Navigation within a MIPL browser is accomplished exclusively by the use of prompts within the script.
When a script is run, all Free Text (that is, text not within tags) and/or its associated voice is output first, in the order specified. References to voice are tags within, and/or referring to, the Free Text.
The well-established URI types, http: and file:, are used extensively within MIPL. MIPL-enabled browsers used in different environments implement new and somewhat exotic URI types, for example a URI that allows transfer of a phone call.
When writing MIPL script or deploying MIPL, one should always remember that it is completely unknown what kind of hardware platform, software platform, or operating environment the script will be used with. MIPL script should be written to be compatible with all circumstances. Chances are that MIPL will be used with and implemented in many new ways, beyond what the author has conceived.
URLs in MIPL, as in standard HTML, may contain page-relative labels. The page-relative label is preceded by a number sign (#), for example: http://www.foo.com/baz.html#main.
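For illustration, the page and its page-relative label can be separated with Python's standard urllib.parse.urldefrag. The helper name split_label is hypothetical; how a MIPL browser actually resolves the label is not specified here.

```python
from urllib.parse import urldefrag


def split_label(url):
    """Return (page_url, label); label is None when the URL has no '#' part."""
    page, label = urldefrag(url)
    return page, (label or None)


if __name__ == "__main__":
    print(split_label("http://www.foo.com/baz.html#main"))
```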
After much consideration and experience in implementation, we have decided to make the standard audio media type 4-bit ADPCM in .vox format (Oki Semiconductor version) with a sampling rate of 8 kHz (with the sampling rate specified by the <VOICERATE> tag). This was based upon the following:
4-bit ADPCM is about the most compression that can be applied to a telephone-bandwidth voice signal without unacceptable degradation.
As of this publishing, all (without exception) of the popular PC-type telephony hardware platforms support this format natively (at the moment, PC-type platforms are the only cost-effective way of implementing telephony applications in any reasonable density). It therefore requires no host overhead for format conversion or decompression. Also, on platforms that do not support it natively, the conversion is trivial and uses quite minimal resources. The format is non-proprietary, and conversion routines are freely and readily available.
Even if lossy compression techniques, such as MPEG or GSM, were of acceptable sound quality (which they are NOT), they would still require staggering amounts of host processor overhead for conversion, although they certainly would use less network bandwidth. With reasonable caching techniques, the potential bandwidth savings are not even a terribly significant issue (since we're only talking about a 2:1 or 3:1 savings at the cost of major processor overhead and sound quality).
There is still a lot of legacy PC telephony hardware available on the used market, which may be used with excellent results and performance and is available at extremely low cost. This hardware generally supports only this format (sometimes even only at 6 kHz). A good-sized system based upon this type of hardware, supporting up to an entire T-1 (24 voice channels), may be implemented on a ridiculously low-end processor (like a 486), thus readily allowing individuals on a very low budget to develop and deploy MIPL-based technologies.
If bandwidth is a significant issue, don't transfer audio at all. Use speech synthesis. MIPL is very good at using only TINY amounts of bandwidth when only transferring text. Pre-recorded audio is only appropriate in environments where there is bandwidth available to support it.
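To make the "trivial conversion" and bandwidth points above concrete, here is a minimal Python sketch. It follows the commonly published Oki/Dialogic 4-bit ADPCM decoding procedure (49-entry step table, 12-bit output, high nibble first); those details are assumptions drawn from that general description rather than from this document, so verify the output against known-good .vox files before relying on it. The function names are hypothetical.

```python
# Step sizes and index adjustments as commonly published for Oki/Dialogic ADPCM
# (assumed here; not taken from this document).
STEP_TABLE = [
    16, 17, 19, 21, 23, 25, 28, 31, 34, 37, 41, 45, 50, 55, 60, 66,
    73, 80, 88, 97, 107, 118, 130, 143, 157, 173, 190, 209, 230, 253,
    279, 307, 337, 371, 408, 449, 494, 544, 598, 658, 724, 796, 876,
    963, 1060, 1166, 1282, 1411, 1552,
]
INDEX_ADJUST = [-1, -1, -1, -1, 2, 4, 6, 8]


def decode_vox(data):
    """Decode 4-bit ADPCM bytes into a list of signed 12-bit samples."""
    sample = 0
    index = 0
    out = []
    for byte in data:
        # Each byte carries two samples; the high nibble is assumed to come first.
        for nibble in (byte >> 4, byte & 0x0F):
            step = STEP_TABLE[index]
            diff = step >> 3
            if nibble & 4:
                diff += step
            if nibble & 2:
                diff += step >> 1
            if nibble & 1:
                diff += step >> 2
            if nibble & 8:
                diff = -diff  # the top bit of the nibble is the sign
            sample = max(-2048, min(2047, sample + diff))
            index = max(0, min(48, index + INDEX_ADJUST[nibble & 7]))
            out.append(sample)
    return out


def bytes_per_second(sample_rate_hz=8000, channels=1):
    """Raw audio data rate for 4-bit samples (half a byte per sample)."""
    return sample_rate_hz * 4 // 8 * channels


if __name__ == "__main__":
    print(bytes_per_second())             # one 8 kHz channel
    print(bytes_per_second(channels=24))  # a full T-1 of voice channels
    print(decode_vox(b"\x7f"))
```

At 8 kHz one channel works out to 4,000 bytes per second, and 24 channels to 96,000 bytes per second. The decoder needs only shifts, adds, and two small table lookups per sample.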
Copyright © by H.R. Jim Dixon 1999 All Rights Reserved