Audio Forensics Exonerates Guilty-Looking Suspect

Audio forensics, or forensic audio analysis, is a scientific field, and solving a case sometimes involves trial and error or the use of experimental techniques. This was one such case. Faced with a challenge for which there was no generally accepted solution, we set out to find one, and we succeeded.

Audio Forensics. What is it?

Audio forensics is a specialized field of scientific investigation that is growing in popularity as the world becomes more digitized. The number of cases in which voice or sound recordings have been relied upon as evidence has increased steadily over the years, and so has our audio forensics capability. The most common audio forensics tasks we are asked to perform are audio enhancement (improving the clarity and intelligibility of audio by filtering out unwanted sound and interference) and voice comparison (comparing known and questioned recordings of a person’s voice in order to determine whether that person is the speaker in both). This particular case used neither of these.

Case Background

Our client, a company, had given certain employees laptop computers to use for business purposes. Personal use of these laptops was allowed provided it did not violate the company’s Acceptable Use guidelines. During a routine security audit, our client’s IT department discovered pornographic images stored on the hard drive of one of these laptops, as well as internet history logs that showed frequent visits to numerous porn and other inappropriate websites. The employee to whom the laptop was assigned was confronted, but he denied knowing anything about the issue and claimed that another employee must have used his laptop (which he stated was usually left at the office). The matter was not taken any further, as the client’s attorney advised that more concrete evidence would be needed to link the employee with the activity in question. In consultation with the attorneys, we installed computer monitoring software on all company laptops and workstations.

Investigation Trigger

At this stage, we (the audio forensics team) were not even aware of the matter, and nobody suspected that it would require audio forensics at all. The situation was being handled by our cyber investigators with the assistance of computer forensics staff. By that time, all employees with laptops had officially been notified of the company’s intention to closely and actively monitor their internet and computer activity, and every staff member had signed an updated agreement regarding the appropriate and prohibited use of company laptops and IT infrastructure. Our client evidently felt that this would be sufficient to deter further infractions. What we didn’t know at the time was that our client had decided to scale back the intensity of their monitoring efforts (because they were concerned about its invasiveness), and that their IT department had been instructed to deactivate all but a few monitoring features. Little did they know that this action would almost cost them an opportunity to uncover the truth and take appropriate action. That very evening, the same laptop was used for around 9 hours to access over 100 websites and download over 5 gigabytes of data. Our client was not yet aware of the events of that evening, and was only alerted to there being an issue when, the next day, the employee whose laptop it was became involved in an argument with a co-worker regarding the laptop. Apparently, the employee in question had arrived at work and discovered that his laptop would no longer start up. It just hung after the initial BIOS POST (power-on self-test).

Computer Forensics Comes Up Short

The client’s IT department took possession of the laptop and contacted our forensic team. They established a proper chain of custody and set about securing the device for a computer forensic examination. I’ll skip the details here because I’m not a computer forensic expert, but the long and short of it was that the laptop had essentially been wiped clean by someone. This was believed to be a deliberate attempt to destroy evidence. Fortunately, the little data that our monitoring software had been set to collect was not lost. It had already been uploaded to the client’s server and was intact (and it contained all appropriate metadata and cryptographic hashes to allow proper authentication). The bad news was that only three features were active (and two of those three were only active by chance). The client had only intended to turn on URL logging, which shows the date/time, current page and target page for any hyperlink or anchor clicked or followed, as well as the arrival/departure date/time, duration of visit, number of on-page actions (any interaction with the page itself) and the full URL/URI. That wouldn’t have been enough to prove who was using the laptop at the time. The other two features that were active were the parental controls (for receiving notifications via SMS or email whenever keywords, contacts or content of concern are accessed or mentioned) and the Alert Capture feature, which records screenshots, webcam snapshots and audio from the microphone for a set period whenever a parental control alert is triggered. Because these features weren’t meant to be turned on, they were not set up properly and no keywords (websites, names, addresses etc.) had been added to the watchlist. By default, the feature is set to trigger an alert when profane language is used. Surprisingly, only one really “bad” word was used in the domain name of one of the visited sites, but that was enough to trigger a capture, so the screenshots, webcam clips and recorded audio were available for analysis.
It was a day or two later that we were asked to examine the audio recording for any evidence.
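
For readers wondering how stored cryptographic hashes help authenticate collected data, here is a minimal Python sketch. It is not the monitoring software's actual mechanism (the article does not say which hash algorithm or storage format was used); SHA-256, the function names and the sample bytes are all assumptions made for illustration.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def matches_recorded_hash(data: bytes, recorded_hex: str) -> bool:
    """True if the data still hashes to the digest recorded at collection time."""
    return sha256_of(data) == recorded_hex.lower()


# Hypothetical stand-in for the contents of one uploaded capture file.
capture = b"example capture bytes"
recorded = sha256_of(capture)  # digest stored when the capture was made

print(matches_recorded_hash(capture, recorded))         # unchanged data
print(matches_recorded_hash(capture + b"x", recorded))  # altered data
```

If even one byte of the file changes after collection, the recomputed digest no longer matches the recorded one, which is what makes the stored hash useful for authentication.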

Audio Forensics Begins

The employee had placed a sticker over the webcam lens because he was concerned that the webcam could be remotely activated. That is completely understandable given what we know about current surveillance capabilities. Unfortunately, this meant that nothing but a black image was captured from the camera. The screenshots, while distasteful at best, didn’t provide any evidence that could be used to identify a particular individual as being the user. While surfing the web, the person using the laptop never visited a website that required him or her to log in, never checked their email, and never did anything that required them to identify themselves or type something that could be used to infer their identity. The audio recordings (at first “glance”) also seemed to be useless: a total of 14 minutes and 33 seconds of near silence. There was the occasional clicking of keyboard keys, mouse/touchpad clicks, occasional background noises and arbitrary sounds made by the user (like the clearing of a throat and a nasal sniff), but nothing immediately discernible: no talking, no recognizable background sounds, no familiar ambient noises. Not much to go on, until we looked a little closer, and listened a lot more intently.

Audio Clarification and Enhancement

In the audio forensics business, much time is spent enhancing audio, removing interference and cancelling out background noise from recordings. This case had us trying every conceivable application, method and technique. We had almost a hundred copies of the recording, each processed in a different way, and each had to be listened to, analyzed and interpreted. After hours of listening, we had each identified specific periods of time that we thought warranted further investigation. When we came together as a team, we went through our individual findings and settled on a handful that were common to all of us. On three occasions we heard what seemed to be DTMF tones (the boop-beep-baap-boop sounds one might hear when dialing a telephone), and at two points in time we could hear a telephone ringing and what sounded like an answering machine beep (no speech was audible).

Focusing on the DTMF

We couldn’t be entirely sure that the sounds we could hear were DTMF tones, but having also counted the number of “beeps” on each occasion we were fairly certain that our hunch was correct. Every time we could hear the tones, there were ten of them. That tallies with the minimum number of digits one needs to dial on most phone networks (ignoring special numbers like 10111 or 911).
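
Counting the "beeps" by ear can be cross-checked with a simple energy detector. The Python sketch below is only an illustration of that idea and not the procedure we followed: the frame length, the threshold and the synthetic test signal are all assumptions, and a real, noisy recording would need a threshold chosen relative to its noise floor.

```python
import math


def count_bursts(samples, frame_len=160, threshold=0.01):
    """Count runs of consecutive frames whose mean power exceeds the threshold."""
    bursts = 0
    active = False
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        loud = power > threshold
        if loud and not active:
            bursts += 1  # a new burst begins on a quiet-to-loud transition
        active = loud
    return bursts


# Synthetic stand-in for a recording: ten 100 ms bursts of a 770 Hz + 1336 Hz
# tone pair, separated by 100 ms of silence, sampled at 8000 Hz.
rate = 8000
burst = [0.5 * math.sin(2 * math.pi * 770 * n / rate)
         + 0.5 * math.sin(2 * math.pi * 1336 * n / rate)
         for n in range(800)]
signal = []
for _ in range(10):
    signal += burst + [0.0] * 800

print(count_bursts(signal))  # 10
```

A count of ten bursts only shows that ten separate sounds occurred; confirming that they are DTMF tones, and which digits they encode, requires looking at their frequency content.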

Decoding the DTMF

The following source code was graciously provided to us by our software developers (the same guys who created CellTrack, CyberSpy, etc). I don’t understand one line of it, but I thought to include it here in case anyone out there was interested in trying it out.

USE DFLIB  
IMPLICIT REAL*8(A-H), REAL*8(O-Z)
PARAMETER (MAXPTS = 6615000) 
COMPLEX*16, ALLOCATABLE    :: Y1(:)
CHARACTER*256 FILEIN, ERMSG
PARAMETER(HAFPI=1.570796326794896D0)
PARAMETER   (PI=3.141592653589793D0)
PARAMETER(TWOPI=6.283185307179586D0)
PARAMETER(TC = .023D0)  
CHARACTER*1 MATRIX(4,4)  
CHARACTER*2048 STRING  
CHARACTER*64 WSTR

STRUCTURE /STRUCT/
REAL*8 W  
REAL*8 I  
REAL*8 Q  
REAL*8 A  
END STRUCTURE

RECORD /STRUCT/  X(8)  
TYPE (FILE$INFO) FINFO
INTEGER(4) HANDLE, FLENGTH

ALLOCATE(Y1(MAXPTS), STAT=IERR)
IF(IERR .NE. 0) THEN 
	WRITE(6,*)'AN ERROR HAS OCCURRED'
	WRITE(6,*)'PRESS ENTER TO CLOSE'
	READ(5,*)
	STOP
END IF

MATRIX(1,1) = '1' ! 697 HZ, 1209 HZ
MATRIX(1,2) = '2'
MATRIX(1,3) = '3'
MATRIX(1,4) = 'A' ! 697 HZ, 1633 HZ
MATRIX(2,1) = '4' ! 770 HZ, 1209 HZ
MATRIX(2,2) = '5'
MATRIX(2,3) = '6'
MATRIX(2,4) = 'B' ! 770 HZ, 1633 HZ
MATRIX(3,1) = '7' ! 852 HZ, 1209 HZ
MATRIX(3,2) = '8'
MATRIX(3,3) = '9'
MATRIX(3,4) = 'C' ! 852 HZ, 1633 HZ
MATRIX(4,1) = '*' ! 941 HZ, 1209 HZ
MATRIX(4,2) = '0'
MATRIX(4,3) = '#'
MATRIX(4,4) = 'D' ! 941 HZ, 1633 HZ

X(1).W = 697
X(2).W = 770
X(3).W = 852
X(4).W = 941
X(5).W = 1209
X(6).W = 1336
X(7).W = 1477
X(8).W = 1633

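DO K = 1,8  
	X(K).W = X(K).W*TWOPI   ! CONVERT HZ TO RADIANS PER SECOND
	X(K).I = 0.D0           ! ZERO THE CORRELATOR STATE BEFORE IT IS ACCUMULATED
	X(K).Q = 0.D0
	X(K).A = 0.D0
END DO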
IERR = SETEXITQQ (QWIN$EXITNOPERSIST)  
FILEIN = ' ' 
CALL GETARG(INT2(1), FILEIN) 
IF(FILEIN.EQ.' ') THEN 
	WRITE(6,*)'PLEASE SELECT AN INPUT FILE'
	WRITE(6,*)'PRESS ENTER TO CONTINUE'
	READ(5,*)
	STOP
END IF 
HANDLE = FILE$FIRST
FLENGTH = GETFILEINFOQQ(TRIM(FILEIN), FINFO, HANDLE)
IF(FLENGTH.EQ.0) THEN 
	WRITE(6,*)'FILE NOT FOUND: '//TRIM(FILEIN)
	WRITE(6,*)'PRESS ENTER TO CONTINUE'
	READ(5,*)
	STOP
END IF
WRITE(6,*)'READING FILE'
CALL WAVREAD(Y1,FILEIN,MAXPTS,NPTS,XMIN,DX,KERR,ERMSG)
IF(KERR .NE. 0) THEN 
	WRITE(6,*) TRIM(ERMSG)
	WRITE(6,*)'PRESS ENTER TO CONTINUE'
	READ(5,*)
	STOP
END IF
WRITE(6,*)'PROCESSING FILE'
STRING = ' '
JSTR = 0
DFRACT = EXP(-DX/TC)  ! PER-SAMPLE DECAY FACTOR FOR TIME CONSTANT TC (SECONDS)
ITONE = 0 
AMPMAX = 0.D0
AMPPEAK = 0.D0
XXI = 0
XXQ = 0
DO J = 1,NPTS  
	T = (J-1)*DX
	DO K = 1, 8  
		XXI = XXI*DFRACT + COS(X(K).W*T)*DREAL(Y1(J))   
		XXQ = XXQ*DFRACT + SIN(X(K).W*T)*DREAL(Y1(J))   
		XXA = XXI ** 2 + XXQ ** 2   
	END DO
	IF(XXA .GT. AMPPEAK) AMPPEAK = XXA
END DO
AMPTHRESH = .25 * AMPPEAK  
AMPOFF = .1 * AMPPEAK
TOFF = -10.
DO J = 1,NPTS  
	T = (J-1)*DX
	DO K = 1, 8  
		X(K).I = X(K).I*DFRACT + COS(X(K).W*T)*DREAL(Y1(J))  
		X(K).Q = X(K).Q*DFRACT + SIN(X(K).W*T)*DREAL(Y1(J))   
		X(K).A = X(K).I ** 2 + X(K).Q ** 2  
	END DO
	INDAMAX = 0   
	INDA2ND = 0
	AMPAMAX = 0.D0
	AMPA2ND = 0.D0
	DO K = 1,4
		IF(X(K).A .GT. AMPAMAX) THEN
			AMPAMAX = X(K).A
			INDAMAX = K
		END IF
	END DO
	DO K = 1,4
		IF(K.NE.INDAMAX .AND. X(K).A .GT. AMPA2ND) THEN
			AMPA2ND = X(K).A
			INDA2ND = K
		END IF
	END DO
	INDBMAX = 0   
	INDB2ND = 0
	AMPBMAX = 0.D0
	AMPB2ND = 0.D0
	DO K = 5,8
		IF(X(K).A .GT. AMPBMAX) THEN
			AMPBMAX = X(K).A
			INDBMAX = K 
		END IF
	END DO
	DO K = 5,8
		IF(K.NE.INDBMAX .AND. X(K).A .GT. AMPB2ND) THEN
			AMPB2ND = X(K).A
			INDB2ND = K
		END IF
	END DO
	AMPMAX = MAX(AMPMAX, AMPAMAX, AMPBMAX)  
	IF(ITONE .EQ. 0) THEN  
		IF(AMPAMAX .GE. AMPTHRESH .AND. AMPBMAX .GE. AMPTHRESH) THEN   
			IF(AMPMAX .GE. 3.D0*AMPOFF) THEN  
				IF(AMPAMAX .GE. 10.D0*AMPA2ND .AND. AMPBMAX .GE. 10.D0*AMPB2ND) THEN  
					ITONE = 1
					IF(T .GT. TOFF + 1.D0) THEN  
						JSTR = JSTR + 1 
						STRING(JSTR:JSTR) = '|'  
					END IF
					JSTR = JSTR + 1
					STRING(JSTR:JSTR) = MATRIX(INDAMAX, INDBMAX - 4)
				END IF
			END IF
		END IF
	ELSE 
		AMPNOW = MAX(AMPAMAX, AMPBMAX)
		IF(AMPNOW .LE. .135335*AMPMAX) THEN  
			ITONE = 0
			TOFF = T
			AMPOFF = AMPNOW
			AMPMAX = AMPNOW
		END IF
	END IF
END DO
NDX = 2  
DO K = 2, JSTR+1
	IF(K.EQ.JSTR+1 .OR. STRING(K:K) .EQ. '|') THEN
		IF(K-NDX .EQ. 11) THEN   ! 11 DIGITS: X-XXX-XXX-XXXX
			WSTR = STRING(NDX:NDX)//"-"//STRING(NDX+1:NDX+3)//"-"//STRING(NDX+4:NDX+6) &
				//"-"//STRING(NDX+7:NDX+10)
		ELSE IF(K-NDX .EQ. 10) THEN   ! 10 DIGITS: XXX-XXX-XXXX
			WSTR = STRING(NDX+0:NDX+2)//"-"//STRING(NDX+3:NDX+5)//"-"//STRING(NDX+6:NDX+9)
		ELSE IF(K-NDX .EQ. 7) THEN   ! 7 DIGITS: XXX-XXXX
			WSTR = STRING(NDX+0:NDX+2)//"-"//STRING(NDX+3:NDX+6)
		ELSE   ! ANY OTHER LENGTH: PRINT THE DIGITS UNFORMATTED
			WSTR = STRING(NDX:K-1)
		END IF
		WRITE(6,*) TRIM(WSTR)
		NDX = K+1
	END IF
END DO
WRITE(6,*)
WRITE(6,*)'PRESS ENTER TO CLOSE'
READ(*,*)
END
SUBROUTINE WAVREAD(A,FNAM,MAXPTS,NPTS,XMIN,DX,KERR,ERMSG)
	COMPLEX*16 A(*)
	CHARACTER*(*) FNAM
	INTEGER MAXPTS
	INTEGER NPTS
	REAL*8 XMIN
	REAL*8 DX
	INTEGER KERR
	CHARACTER*(*) ERMSG
	LOGICAL XST, OPN
	CHARACTER*1 CDUM
	INTEGER*2 IBUF(128)
	CHARACTER*4   ChunkID 
	INTEGER*4   ChunkSize 
	CHARACTER*4   wFormat  
	CHARACTER*4   Subchunk1ID  
	INTEGER*4   Subchunk1Size  
	INTEGER*2   AudioFormat    
	INTEGER*2   NumChannels    
	INTEGER*4   SampleRate    
	INTEGER*4   ByteRate       
	INTEGER*2   BlockAlign     
	INTEGER*2   BitsPerSample  
	CHARACTER*4   Subchunk2ID  
	INTEGER*4   Subchunk2Size  
	KERR = 0
	ERMSG = ' '
	DO LU = 99,1,-1
		INQUIRE(LU,OPENED=OPN)
		IF(.NOT.OPN) GO TO 30
	END DO

30		OPEN(LU,FILE=FNAM,MODE='READ',FORM='BINARY',ERR=998) 
		READ(LU)ChunkID,ChunkSize,wFormat,Subchunk1ID,Subchunk1Size, &
			AudioFormat,NumChannels,SampleRate,ByteRate,BlockAlign,BitsPerSample
		IF(ChunkID.NE.'RIFF' .OR. wFormat.NE.'WAVE') THEN
			ERMSG = 'UNRECOGNIZED FORMAT'
			KERR = 2
			CLOSE(LU)   ! DO NOT LEAVE THE FILE OPEN ON THE ERROR PATH
			RETURN
		END IF 
		IF(Subchunk1Size.GT.16) THEN
			DO JJ = 17, Subchunk1Size
				READ(LU)CDUM
			END DO
		END IF
		READ(LU)Subchunk2ID,Subchunk2Size
		NPTS = MIN(MAXPTS, Subchunk2Size/(NumChannels * BitsPerSample/8)) 
		XMIN = 0.
		DX = 1.D0/DFLOAT(SampleRate)
		NPTX = 0 
100		NPREAD = MIN (NPTS-NPTX, 64)
		IF(NPREAD.LE.0) THEN
			CLOSE(LU)
			RETURN
		END IF
		IF(NumChannels.EQ.1) THEN  
			READ(LU,ERR=999,END=999) (IBUF(JJ), JJ = 1, NPREAD)
			DO JJ = 1, NPREAD
				NPTX = NPTX + 1
				A(NPTX) = DCMPLX(IBUF(JJ),0.D0)
			END DO
		ELSE 
			READ(LU,ERR=999,END=999) (IBUF(JJ), JJ = 1, 2*NPREAD)
			DO JJ = 1, 2*NPREAD, 2
				NPTX = NPTX + 1
				A(NPTX) = DCMPLX(IBUF(JJ),IBUF(JJ+1))
			END DO
		END IF
		GO TO 100
998  	ERMSG='Cannot find file: '//FNAM(1:LEN_TRIM(FNAM))
		KERR = 1
		CLOSE(LU)
		RETURN
999   	ERMSG='Error reading file: '//FNAM(1:LEN_TRIM(FNAM))
		KERR = 1
		CLOSE(LU)
		RETURN
		END
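
For readers who would rather experiment without a Fortran compiler, here is a minimal Python sketch of DTMF decoding. It is not a port of the listing above: that program tracks each frequency with exponentially decaying in-phase/quadrature sums, whereas this sketch swaps in the Goertzel algorithm on fixed-size blocks. The block length, thresholds, function names and the synthetic test signal are all assumptions chosen for illustration, and a real recording would need tuning.

```python
import math

ROW_HZ = (697.0, 770.0, 852.0, 941.0)
COL_HZ = (1209.0, 1336.0, 1477.0, 1633.0)
KEYS = ("123A", "456B", "789C", "*0#D")


def goertzel_power(block, freq, rate):
    """Squared magnitude of `block` at `freq`, via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in block:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def classify_block(block, rate, min_power, dominance=10.0):
    """Return the key heard in this block, or None if no clean tone pair is found."""
    rows = [goertzel_power(block, f, rate) for f in ROW_HZ]
    cols = [goertzel_power(block, f, rate) for f in COL_HZ]
    r = max(range(4), key=rows.__getitem__)
    c = max(range(4), key=cols.__getitem__)
    for powers, best in ((rows, r), (cols, c)):
        runner_up = max(p for i, p in enumerate(powers) if i != best)
        # Require a strong tone that clearly dominates the rest of its group.
        if powers[best] < min_power or powers[best] < dominance * runner_up:
            return None
    return KEYS[r][c]


def decode(samples, rate, block_len=205, min_power=100.0):
    """Decode a key sequence, emitting each key once per tone burst."""
    keys = []
    previous = None
    for start in range(0, len(samples) - block_len + 1, block_len):
        key = classify_block(samples[start:start + block_len], rate, min_power)
        if key is not None and key != previous:
            keys.append(key)
        previous = key
    return "".join(keys)


def synth(keys, rate=8000, tone_len=820, gap_len=410):
    """Build a clean synthetic test signal: one tone pair per key, then silence."""
    out = []
    for key in keys:
        r = next(i for i, row in enumerate(KEYS) if key in row)
        c = KEYS[r].index(key)
        for n in range(tone_len):
            t = n / rate
            out.append(0.5 * math.sin(2 * math.pi * ROW_HZ[r] * t)
                       + 0.5 * math.sin(2 * math.pi * COL_HZ[c] * t))
        out.extend([0.0] * gap_len)
    return out


print(decode(synth("0115550123"), 8000))
```

The dominance test mirrors the idea in the Fortran listing of accepting a tone only when the strongest frequency in each group is at least ten times stronger than the runner-up, which helps reject keyboard clicks and other broadband noise.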

To be continued
