[IEEE 2005 Asia-Pacific Conference on Applied Electromagnetics - Johor, Malaysia (20-21 Dec. 2005)]...


2005 Asia-Pacific Conference on Applied Electromagnetics Proceedings, December 20-21, 2005, Johor Bahru, Johor, MALAYSIA

Secure Speech Communication over Public Switched Telephone Network

Nuzli Mohamad Anas¹, Zainudin Rahman¹, Azman Shafii¹, Muhamad Najib Abd Rahman¹, and Zainal Abidin Mat Amin¹

¹Telekom R&D Sdn Bhd, 43400 Serdang, Selangor, Malaysia

Abstract - This paper describes the design of a secure telephone prototype, named "SecurePhone", that offers practical and reliable encryption, is easy to operate, and connects to the Public Switched Telephone Network (PSTN). SecurePhone is a terminal device designed to operate reliably with high speech quality. It can act as both an ordinary telephone and a secure instrument over the dial-up PSTN. The secure telephone operates in full duplex over a single telephone circuit using echo-cancelling modem technology.

1. Introduction

In this information age, individuals' rights must be protected from eavesdroppers. This includes their legitimate personal and business transactions. A mechanism is needed to ensure privacy, authentication, and integrity in voice communication. The SecurePhone significantly increases security against intruders on the Public Switched Telephone Network (PSTN). By using encryption, individuals can be reasonably confident that no third party is eavesdropping on the conversation. Moreover, each individual can be certain of the other party's identity.

Figure 1: SecurePhone Connection

SecurePhone is composed of three main functional blocks: a speech coder, an encryption algorithm and a digital modem. SecurePhone is equipped with a 4.8 kbps improved code-excited linear prediction (ICELP) speech coder, which means that secure data can be transmitted at a rate of 4.8 kbps. The Data Encryption Standard (DES) is used to provide a highly secure voice signal for transmission over the telephone network. DES is a symmetric cryptosystem, which uses the same secret key to encrypt and decrypt the data. For voice transmission the modem is based on an adapted version of the V.32 algorithm. This modem works at full-duplex 9.6 kbps, using Trellis Coded Modulation (TCM).

Figure 2: SecurePhone Algorithm Architecture

The paper is organized as follows: Section 2 gives an overview of the algorithmic solutions adopted for the three main blocks, which are the speech coder, encryption and modem. Section 3 describes the SecurePhone architecture; Section 4 is dedicated to the SecurePhone operating modes. Finally, the design is summarized in the final section.

2. SecurePhone Algorithms

The basic system components are best understood from the block diagram in Fig. 2. The speech signal is digitized at 8 kHz using an A-law codec before it passes through the three main algorithms: speech coder, encryption and modem. These algorithms are explained below.
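The A-law companding applied at the digitizing step can be illustrated with a minimal sketch. It assumes the continuous A-law curve with the standard parameter A = 87.6 and samples normalized to [-1, 1]; the function names are illustrative, and a real codec uses the 8-bit segmented approximation of this curve rather than floating-point arithmetic.

```python
import math

A = 87.6  # standard A-law compression parameter (assumed, as in G.711)
_DENOM = 1.0 + math.log(A)


def alaw_compress(x: float) -> float:
    """Map a normalized sample in [-1, 1] onto the companded range [-1, 1]."""
    ax = abs(x)
    if ax < 1.0 / A:
        y = A * ax / _DENOM                   # linear segment near zero
    else:
        y = (1.0 + math.log(A * ax)) / _DENOM  # logarithmic segment
    return math.copysign(y, x)


def alaw_expand(y: float) -> float:
    """Inverse of alaw_compress."""
    ay = abs(y)
    if ay < 1.0 / _DENOM:
        x = ay * _DENOM / A
    else:
        x = math.exp(ay * _DENOM - 1.0) / A
    return math.copysign(x, y)
```

Small amplitudes are stretched and large amplitudes compressed, so quiet speech keeps more resolution after 8-bit quantization.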

2.1 Speech Coder

The speech coder used in this system is a 4.8 kbps improved code-excited linear prediction (ICELP) type, based on the well-known CELP model and strongly optimized, ensuring a significant improvement in speech quality at low computational complexity. The ICELP codec divides the speech into 30 ms frames, each of which is further divided into four 7.5 ms sub-frames. For each frame the encoder computes a set of 10 filter coefficients for the short-term synthesis filter [5].
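The frame figures quoted above follow from simple arithmetic, sketched below. The function name and dictionary keys are illustrative; the inputs (8 kHz sampling, 4.8 kbps, 30 ms frames, four sub-frames) are the values stated in the text.

```python
def frame_budget(sample_rate_hz=8000, bit_rate_bps=4800,
                 frame_ms=30, subframes=4):
    """Derive per-frame sample and bit counts for the speech coder."""
    samples_per_frame = sample_rate_hz * frame_ms // 1000
    return {
        "samples_per_frame": samples_per_frame,
        "samples_per_subframe": samples_per_frame // subframes,
        "subframe_ms": frame_ms / subframes,
        "bits_per_frame": bit_rate_bps * frame_ms // 1000,
    }


print(frame_budget())
```

Each 30 ms frame therefore covers 240 input samples (60 per sub-frame) and is coded into 144 bits.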

0-7803-9431-3/05/$20.00 ©2005 IEEE.

The excitation for this filter is determined for each sub-frame, and is given by the sum of scaled entries from two codebooks. An adaptive codebook is used to model the long-term periodicities present in voiced speech, and for each sub-frame an index and a gain are determined for this codebook. A fixed codebook containing 512 pseudo-random codes is also searched to find the codebook entry, and the gain multiplier for this entry, which minimize the error between the reconstructed samples and the original speech samples. At the decoder the scaled entries from the two codebooks are passed through the synthesis filter to give the reconstructed speech. Finally this speech is passed through a post filter to improve its perceptual quality.
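The decoder-side step described above can be sketched as follows. This is a simplified illustration of generic CELP synthesis, not the ICELP implementation: the codebook search and the post filter are omitted, the function and parameter names are assumptions, and the synthesis filter is taken to be the all-pole filter 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p (p = 10 in the coder described here).

```python
def celp_synthesize(adaptive_vec, fixed_vec, g_adaptive, g_fixed,
                    lpc, state=None):
    """Reconstruct one sub-frame from two scaled codebook entries.

    lpc   : coefficients a_1..a_p of A(z)
    state : the p most recent output samples, newest first (zeros if None)
    Returns (output_samples, new_state).
    """
    # Excitation is the sum of the scaled adaptive and fixed codebook entries.
    excitation = [g_adaptive * a + g_fixed * f
                  for a, f in zip(adaptive_vec, fixed_vec)]

    history = list(state) if state is not None else [0.0] * len(lpc)
    out = []
    for e in excitation:
        # All-pole recursion: s[n] = e[n] - sum_k a_k * s[n-k]
        s = e - sum(a_k * h for a_k, h in zip(lpc, history))
        out.append(s)
        history = [s] + history[:-1]
    return out, history
```

Passing the returned state into the next call keeps the filter memory continuous across sub-frames.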

Table I: 4.8 kbps ICELP Speech Coder Performance

Parameter              Value

MOS (Note 1)           3.7

Frame size             30 ms

Algorithm delay        37.5 ms

Input signal format    Linear PCM, 16 bit

Bitstream format       144 bits

Note 1: The mean opinion score (MOS) provides a numerical measure of the quality of human speech. The scheme uses subjective tests (opinion scores) that are mathematically averaged to obtain a quantitative indicator of the system performance.

2.2 Encryption Technique

Unencrypted data is called plain text. Encryption (also called enciphering) is the process of transforming plain text into cipher text. Decryption (also called deciphering) is the inverse transformation. The encryption and decryption processes are performed according to a set of rules, called an algorithm, which is typically based on a parameter called a key. The key is usually the only parameter that must be provided to or by the users of a cryptographic system, and it must be kept secret.

A key consists of 64 binary digits, of which 56 bits are randomly generated and used directly by the algorithm. The other 8 bits, which are not used by the algorithm, are used for error detection. The 8 error-detecting bits are set to make the parity of each 8-bit byte of the key odd, i.e., there is an odd number of "1"s in each 8-bit byte. Authorized users of encrypted data must have the key that was used to encipher the data in order to decrypt it. The unique key chosen for use in a particular application makes the results of encrypting data with the algorithm unique. Selecting a different key causes the cipher that is produced for any given set of inputs to be different. The security of the data depends on the key used to encipher and decipher the data.
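The odd-parity rule for the key bytes can be sketched as below. The helper names are illustrative; the sketch assumes the usual DES convention that the parity bit is the least significant bit of each byte.

```python
def set_odd_parity(key: bytes) -> bytes:
    """Return the 8-byte key with each byte's low bit set for odd parity."""
    if len(key) != 8:
        raise ValueError("a DES key is 8 bytes (64 bits) long")
    out = bytearray()
    for b in key:
        seven = b & 0xFE                     # the 7 key bits, parity bit cleared
        ones = bin(seven).count("1")
        out.append(seven | (0 if ones % 2 == 1 else 1))
    return bytes(out)


def has_odd_parity(key: bytes) -> bool:
    """True if every byte contains an odd number of 1 bits."""
    return all(bin(b).count("1") % 2 == 1 for b in key)
```

A receiver can run `has_odd_parity` on an entered key to catch simple typing or transmission errors before the key is used.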

The DES core function encrypts and decrypts data in 64-bit blocks, using a 64-bit key (although the effective key strength is only 56 bits, as explained above). It takes a 64-bit block of plain text as input and outputs a 64-bit block of cipher text. Since it always operates on blocks of equal size and uses both permutations and substitutions in the algorithm, DES is both a block cipher and a product cipher [6].

DES has 16 rounds, meaning that the main algorithm is repeated 16 times to produce the cipher text. It has been found that the time required to find a key using a brute-force attack grows exponentially with the number of rounds. So as the number of rounds increases, the security of the algorithm increases exponentially.
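The round structure can be illustrated with a toy 16-round Feistel network on 64-bit blocks. This sketch is NOT DES: the round function and key schedule are stand-ins built on SHA-256, and the DES permutations and S-boxes are omitted. It only shows how repeating one round 16 times gives a cipher whose decryption is the same procedure with the sub-keys in reverse order. All names are hypothetical.

```python
import hashlib


def _round_function(half: int, subkey: bytes) -> int:
    # Stand-in for the DES f-function (assumption, not the real S-boxes).
    digest = hashlib.sha256(subkey + half.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")


def _subkeys(key: bytes, rounds: int):
    # Stand-in key schedule: one derived sub-key per round.
    return [hashlib.sha256(key + bytes([i])).digest()[:6] for i in range(rounds)]


def _feistel(block: bytes, subkeys) -> bytes:
    if len(block) != 8:
        raise ValueError("block must be 64 bits (8 bytes)")
    left = int.from_bytes(block[:4], "big")
    right = int.from_bytes(block[4:], "big")
    for k in subkeys:
        left, right = right, left ^ _round_function(right, k)
    # The halves are swapped on output, as in DES.
    return right.to_bytes(4, "big") + left.to_bytes(4, "big")


def toy_encrypt_block(block: bytes, key: bytes, rounds: int = 16) -> bytes:
    return _feistel(block, _subkeys(key, rounds))


def toy_decrypt_block(block: bytes, key: bytes, rounds: int = 16) -> bytes:
    return _feistel(block, list(reversed(_subkeys(key, rounds))))
```

Because each round only XORs one half with a function of the other, the round function itself never needs to be inverted.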

2.3 Modem

Since the speech is sampled and digitized, a modem algorithm is used to transmit the data through the PSTN, and the 9.6 kbps rate is adopted because it assures good quality for voice reconstruction. For this purpose the V.32 data modem algorithm is used; it supports asynchronous and synchronous full-duplex operation at two data rates, 4.8 and 9.6 kbps, using TCM modulation over dial-up or two-wire leased lines. V.32 uses echo cancellation to achieve full-duplex transmission. The echo canceller suppresses both near and far echoes, with a far-echo frequency shift of up to 14 Hz. Table II [5] below shows the performance of the V.32 data modem algorithm.

Table II: V.32 Data Modem Performance

Parameter                    Value

Dynamic range                55 dB

Frequency offset             10 Hz

Baud frequency offset        ±0.01 %

Far echo frequency offset    14 Hz

Far echo delay               Up to 2 sec

Echo suppression             60 dB

Note 1: The far echo delay parameter means the maximum delay (in milliseconds) of the far echo signal that could be suppressed by the modem. The modem reserves the buffer and saves the transmitted signal to cancel it later. To save 1 second of the signal, the buffer size needs to be 2400 words.

Note 2: MIPS requirements are given for the 14400 bps rate.

The input signal is filtered by a band-pass filter with a Hilbert transformer, which eliminates the need for a separate DC offset filter. The far echo delay parameter is the maximum delay of the far echo signal that can be suppressed by the modem. The modem reserves a buffer and saves the transmitted signal in order to cancel it later.
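The buffering of the transmitted signal can be sketched as a fixed-length delay line. The figure of 2400 words per second of stored signal comes from the table note in the source; the class and function names are hypothetical, and the adaptive filtering that performs the actual cancellation is not shown.

```python
import math
from collections import deque

WORDS_PER_SECOND = 2400  # buffer words needed per second of signal (from the source)


def far_echo_buffer_words(max_delay_s: float) -> int:
    """Buffer size needed to cover a given maximum far-echo delay."""
    return math.ceil(max_delay_s * WORDS_PER_SECOND)


class TransmitHistory:
    """Keeps the most recent transmitted words so a delayed copy can be read."""

    def __init__(self, max_delay_s: float):
        n = far_echo_buffer_words(max_delay_s)
        self._buf = deque([0] * n, maxlen=n)

    def push(self, word: int) -> None:
        self._buf.append(word)  # the oldest word drops out automatically

    def delayed(self, delay_words: int) -> int:
        """Word sent `delay_words` pushes before the most recent one."""
        return self._buf[-1 - delay_words]
```

For the 2 second maximum far echo delay, `far_echo_buffer_words(2.0)` gives 4800 words.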


3. SecurePhone Architecture

In this work, the SecurePhone has been designed and developed using a fixed-point DSP, for which the Texas Instruments (TI) TMS320C54CST development board was selected. The C54CST (Client Side Telephony) chip contains a set of the most demanded telephony algorithms as well as a special CST Framework, which ties them together and provides unified access to each of them. The CST software was licensed by Texas Instruments from Spirit Corp. It can be controlled either externally via a serial link using AT commands or from inside the DSP using several different control layers of the CST Framework.


Figure 3: General Hardware Setup of the CST chip

3.1 CST Software Components

The CST Software includes several components, each of which is a standalone XDAIS-compliant algorithm. The data communication algorithms provide the V.32bis/V.32, V.22bis/V.22 and V.42/V.42bis standards for data transmission. DTMF detection and generation are provided for telephony signal processing, together with Caller ID types I & II and a Call Progress Tone Detection (CPTD) algorithm. For speech processing, the CST Software provides a G.168 line echo canceller, G.726 ADPCM compression and the G.711 PCM algorithm. Other included components are the Voice Activity Detection (VAD), Automatic Gain Control (AGC) and Comfort Noise Generation (CNG) functions.

There is also an integration shell (the CST Framework), which consists of several layers and forms a very flexible and configurable framework. Each framework layer has its own intermediate interface, with its own level of abstraction. The AT Command Parser layer provides the data and voice commands. The CST Action layer gives the user control over the CST Solution as a whole by mapping all commands and messages to the different CST sub-layers. The CST Commander layer gives the user control over the CST Solution through a set of special command sequences. The CST Service layer provides data flow between the different XDAIS components and device drivers, and gives the user unified access to the CST XDAIS components through a set of special messages.

Figure 4: CST Software Block Diagram

The CST Framework has built-in drivers and data flow controllers for the UART, DAA and handset codec, and also handles memory management and other system services. The Framework was organized to give the user maximal flexibility. To achieve this, many Framework functions call one another via function pointers. This allows the user to override these functions as well as the driver routines. The user can still use the standalone XDAIS algorithms directly, bypassing the CST Framework, or use the Framework partially. Besides the CST Software, a start-up boot loader and the core code of DSP/BIOS are also stored in the ROM of the CST chip.
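The override mechanism can be illustrated with a short sketch in which replaceable callables play the role of the function pointers. The class and method names are hypothetical and do not correspond to the actual CST Framework API, which is a C interface on the DSP.

```python
class ToyFramework:
    """Framework routines are looked up through replaceable references."""

    def __init__(self):
        # Each attribute acts like a function pointer with a default target.
        self.read_sample = self._default_read_sample
        self.process = self._default_process

    def _default_read_sample(self) -> int:
        return 0  # stands in for a built-in driver routine

    def _default_process(self) -> int:
        # Calls the driver through the reference, never directly,
        # so a user override takes effect without changing this code.
        return self.read_sample() * 2


fw = ToyFramework()
fw.read_sample = lambda: 21   # the user overrides the driver routine
print(fw.process())
```

The built-in `process` routine stays unchanged, which mirrors how code fixed in ROM can still pick up a user-supplied routine.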

The interconnection of the XDAIS algorithms inside the CST Service layer is shown in Figure 5. The user can control how components are connected inside the CST Framework and which components are currently active via AT commands or, in Flex mode, via messages to one of the CST control layers. The CST Framework is flexible and expandable, even though its code resides in ROM.

Figure 5: CST Solution Data Path [3]

Initially, the CST Framework interacts with the host via AT commands. This is the chipset mode, which is configured as the default. A general view of the CST Framework structure in its initial configuration is shown in Figure 6 [3].


Figure 6: CST Framework Controlled via AT Command Parser [3]

User code loaded into the CST chip must contain a new main() function. This function should perform initialization and periodically call the CST single-thread function CSTServiceProcess(). Initially, the CST Framework does not use DSP/BIOS functionality, although the DSP/BIOS core is contained in ROM.

4. SecurePhone Connectivity

This section describes the operating procedure of SecurePhone Version 1.0, which is based on the CST solution. A secure phone is a system that allows encrypted voice to be transmitted over the PSTN. The audio samples are first compressed by the voice coder, then encrypted and sent to the other party using a modem over the PSTN. The receiving side decrypts the received signal and then decompresses it. The decompressed voice is then converted back into a voice signal. SecurePhone requires a line that supports a data rate of at least 9600 bps. The operation is full duplex, just like on a regular phone.
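The order of operations on each side can be sketched with stand-in stages. The sketch assumes `zlib` as a placeholder for the speech coder (ICELP is lossy and works on 30 ms frames) and a repeating-key XOR as a placeholder for DES; neither is secure or representative of the real algorithms, and the function names are hypothetical. Only the sequence compress, encrypt, send, then decrypt, decompress is taken from the text.

```python
import zlib
from itertools import cycle


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Placeholder cipher (XOR with a repeating key); it is its own inverse."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def transmit(pcm: bytes, key: bytes) -> bytes:
    """Sender side: compress first, then encrypt; the result goes to the modem."""
    return toy_cipher(zlib.compress(pcm), key)


def receive(line_bytes: bytes, key: bytes) -> bytes:
    """Receiver side: decrypt first, then decompress."""
    return zlib.decompress(toy_cipher(line_bytes, key))


key = bytes([0x13, 0x37, 0xC0, 0xDE, 0x42, 0x99, 0xAB, 0xCD])
pcm = bytes(range(256)) * 4
assert receive(transmit(pcm, key), key) == pcm
```

Compression comes before encryption because encrypted data looks random and no longer compresses.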

To place a call, the ATD command is keyed in, followed by the phone number of the other party. When that party receives the RING message, it has to respond with the ATA command. If both parties connect successfully, CONNECT 9600 appears on both screens, indicating that voice communication can start.
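The call set-up exchange can be sketched as a small simulation. The `Endpoint` class, the directory dictionary and the phone numbers are hypothetical; only the ATD, RING, ATA and CONNECT 9600 strings come from the procedure described here, and the NO CARRIER and ERROR replies are assumed placeholders for failure cases.

```python
class Endpoint:
    """One SecurePhone board as seen from its AT command console."""

    def __init__(self, number: str):
        self.number = number
        self.peer = None
        self.ringing_from = None
        self.screen = []  # messages shown to the user

    def command(self, line: str, directory: dict) -> None:
        if line.startswith("ATD"):
            callee = directory.get(line[3:])
            if callee is None:
                self.screen.append("NO CARRIER")
                return
            callee.ringing_from = self
            callee.screen.append("RING")
        elif line == "ATA" and self.ringing_from is not None:
            caller = self.ringing_from
            self.peer, caller.peer = caller, self
            self.ringing_from = None
            for end in (self, caller):
                end.screen.append("CONNECT 9600")
        else:
            self.screen.append("ERROR")


a, b = Endpoint("1001"), Endpoint("1002")
directory = {a.number: a, b.number: b}
a.command("ATD1002", directory)   # caller dials
b.command("ATA", directory)       # callee answers after RING
print(a.screen, b.screen)
```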

This voice communication setup has a predefined KEY for encryption. To use a secret key of one's own, a new 42 hexadecimal key should be entered on both boards before the ATD or ATA command. The KEY must be the same for both parties.

For the purpose of comparison, a third telephone can be connected in parallel between the two communicating telephones. While the two communicating telephones still operate as normal, the third telephone hears only noise or a hissing sound.

5. Summary

In this paper, a SecurePhone has been developed as a terminal device capable of transmitting a secured voice signal through the PSTN. We were also able to demonstrate a working prototype using the Texas Instruments C54CST chip. Although there are other security devices on the market, SecurePhone has the advantage of being locally designed, which is useful for protecting restricted information.

References

[1] Calpe, J.; Magdalena, J.R.; Guerrero, J.F.; Frances, J.V., "Toll-quality digital secraphone", Electrotechnical Conference, 1996. MELECON '96, 8th Mediterranean, Volume 3, 13-16 May 1996, Page(s): 1714 - 1717 vol.3

[2] Diez-Del-Rio, L.; Moreno-Perez, S.; Sarmiento, R.; Parera, J.; Veiga-Perez, M.; Garcia-Gomez, R., "Secure speech and data communication over the public switching telephone network", Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on, Volume ii, 19-22 April 1994, Page(s): II/425 - II/428 vol.2

[3] Texas Instruments, SPRU029A, "Client Side Telephony (CST) Chip Software User's Guide"

[4] Zainuddin Rahman, "Flex Mode Programming TMS320C54CST (Client Side Telephony) Evaluation Module", Technical Report

[5] http://www.spiritdsp.com/

[6] http://www.imaginetechnology.net/
