[pjsip] Deadlock when snd-auto-close=1 (the default)

Scott M Ober smo at doclinx.com
Thu Apr 21 14:48:37 EDT 2016


I'm running PJSUA from PJSIP version 2.4.5 on both a Mac and on Linux,

I've modified the PJSUA to be a test app for our SIP servers.
I've added a thread that makes calls from all configured SIP URI's to a test number.

A call is made and when the DISCONNECTED state happens it starts the next call after waiting 3 seconds.

This works for about 6 hours typically when a deadlock occurs.

The deadlock always occurs because the close_snd_timer_cb has obtained a PJSUA_LOCK and some flaw in the code
in pa_unix_util.c PaUnixThread_Terminate is waiting for a join that never happens.

The setting for snd-auto-close is 1 (the default)

I'm retrying the system with snd-auto-close set to -1 and will relay the results.
(setting it to -1 works after 2 days)

It seems dangerous to keep the entire system waiting for a sound device to close and a thread to stop on a system wide lock.

Has anyone else experienced this problem?

-- Scott.

-- Code where lock is obtained

pjsua_aud.c

/* Timer callback to close sound device */
static void close_snd_timer_cb( pj_timer_heap_t *th,
				pj_timer_entry *entry)
{
     PJ_UNUSED_ARG(th);

     PJSUA_LOCK();
     if (entry->id) {
	PJ_LOG(4,(THIS_FILE,"Closing sound device after idle for %d second(s)",
		  pjsua_var.media_cfg.snd_auto_close_time));

	entry->id = PJ_FALSE;

	close_snd_dev();
     }
     PJSUA_UNLOCK();
}

-- GDB backtrace of thread which owns the lock


(gdb) bt
#0  0x00007ffff707c66b in pthread_join (threadid=140737260861184, thread_return=0x7ffff3f0d8a8) at pthread_join.c:92
#1  0x000000000055a4fb in PaUnixThread_Terminate (self=0x7fffec003120, wait=1, exitResult=0x7ffff3f0d8e8) at src/../../../portaudio/src/os/unix/pa_unix_util.c:433
#2  0x0000000000561216 in RealStop (stream=0x7fffec002fa0, abort=0) at src/../../../portaudio/src/hostapi/alsa/pa_linux_alsa.c:2251
#3  0x00000000005612ca in StopStream (s=0x7fffec002fa0) at src/../../../portaudio/src/hostapi/alsa/pa_linux_alsa.c:2279
#4  0x00000000005599ad in Pa_StopStream (stream=0x7fffec002fa0) at src/../../../portaudio/src/common/pa_front.c:1423
#5  0x00000000004da37c in strm_stop (s=0x7fffec001b58) at ../src/pjmedia-audiodev/pa_dev.c:1239
#6  0x00000000004d7b8e in pjmedia_aud_stream_stop (strm=0x7fffec001b58) at ../src/pjmedia-audiodev/audiodev.c:838
#7  0x00000000004b6ac7 in stop_sound_device (snd_port=0x7fffec000bc8) at ../src/pjmedia/sound_port.c:347
#8  0x00000000004b709b in pjmedia_snd_port_destroy (snd_port=0x7fffec000bc8) at ../src/pjmedia/sound_port.c:538
#9  0x0000000000448f10 in close_snd_dev () at ../src/pjsua-lib/pjsua_aud.c:1902
#10 0x0000000000445ca7 in close_snd_timer_cb (th=0x920ee8, entry=0x8fe580 <pjsua_var+591008>) at ../src/pjsua-lib/pjsua_aud.c:448
#11 0x0000000000589e90 in pj_timer_heap_poll (ht=0x920ee8, next_delay=0x7ffff3f0de50) at ../src/pj/timer.c:643
#12 0x00000000004752f9 in pjsip_endpt_handle_events2 (endpt=0x920bc8, max_timeout=0x7ffff3f0de90, p_count=0x7ffff3f0de88) at ../src/pjsip/sip_endpoint.c:712
#13 0x000000000043386e in pjsua_handle_events (msec_timeout=10) at ../src/pjsua-lib/pjsua_core.c:1856
#14 0x000000000043139d in worker_thread (arg=0x0) at ../src/pjsua-lib/pjsua_core.c:696
#15 0x00000000005739f3 in thread_main (param=0x984e48) at ../src/pj/os_core_unix.c:541
#16 0x00007ffff707b182 in start_thread (arg=0x7ffff3f0e700) at pthread_create.c:312
#17 0x00007ffff663a47d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
  

-- Screen Output


[Thread 0x7ffff270b700 (LWP 32303) exited]
[New Thread 0x7ffff270b700 (LWP 32312)]
[Thread 0x7ffff270b700 (LWP 32312) exited]
[New Thread 0x7ffff270b700 (LWP 32317)]
[Thread 0x7ffff270b700 (LWP 32317) exited]
[New Thread 0x7ffff270b700 (LWP 32326)]
[Thread 0x7ffff270b700 (LWP 32326) exited]
[New Thread 0x7ffff270b700 (LWP 32335)]
[Thread 0x7ffff270b700 (LWP 32335) exited]
[New Thread 0x7ffff270b700 (LWP 32344)]
[Thread 0x7ffff270b700 (LWP 32344) exited]
[New Thread 0x7ffff270b700 (LWP 32351)]
13:12:49.382 tcpc0x7fffe800  TCP connect() error: Connection timed out [code=120110]
13:12:49.382    pjsua_app.c  SIP TCP transport is disconnected from [54.252.148.80:5062]: Connection timed out [status=120110]
[Thread 0x7ffff270b700 (LWP 32351) exited]
[New Thread 0x7ffff270b700 (LWP 32363)]
[Thread 0x7ffff270b700 (LWP 32363) exited]
[New Thread 0x7ffff270b700 (LWP 32370)]
[Thread 0x7ffff270b700 (LWP 32370) exited]
[New Thread 0x7ffff270b700 (LWP 32379)]
13:14:07.605 tcpc0x7fffe800  TCP connect() error: Connection refused [code=120111]
13:14:07.605    pjsua_app.c  SIP TCP transport is disconnected from [182.23.65.18:5062]: Connection refused [status=120111]
13:14:15.927  pjsua_media.c  ......pjmedia_transport_media_start() failed for call_id 0 media 0: Invalid SRTP crypto attribute (PJMEDIA_SRTP_ESDPINCRYPTO)
13:14:15.927  pjsua_media.c  ......Error updating media call00:0: Invalid SRTP crypto attribute (PJMEDIA_SRTP_ESDPINCRYPTO)
13:14:15.927   pjsua_call.c  .....Unable to create media session: No active media stream after negotiation (PJMEDIA_SDPNEG_ENOMEDIA) [status=220048]
[Thread 0x7ffff270b700 (LWP 32379) exited]
[New Thread 0x7ffff270b700 (LWP 32382)]
[Thread 0x7ffff270b700 (LWP 32382) exited]
[New Thread 0x7ffff270b700 (LWP 32391)]
[Thread 0x7ffff270b700 (LWP 32391) exited]
[New Thread 0x7ffff270b700 (LWP 32398)]
[Thread 0x7ffff270b700 (LWP 32398) exited]
[New Thread 0x7ffff270b700 (LWP 32405)]
[Thread 0x7ffff270b700 (LWP 32405) exited]
[New Thread 0x7ffff270b700 (LWP 32414)]
[Thread 0x7ffff270b700 (LWP 32414) exited]
[New Thread 0x7ffff270b700 (LWP 32432)]
13:17:18.433  pjsua_media.c  ......pjmedia_transport_media_start() failed for call_id 2 media 0: Invalid SRTP crypto attribute (PJMEDIA_SRTP_ESDPINCRYPTO)
13:17:18.433  pjsua_media.c  ......Error updating media call02:0: Invalid SRTP crypto attribute (PJMEDIA_SRTP_ESDPINCRYPTO)
13:17:18.434   pjsua_call.c  .....Unable to create media session: No active media stream after negotiation (PJMEDIA_SDPNEG_ENOMEDIA) [status=220048]
[Thread 0x7ffff270b700 (LWP 32432) exited]
[New Thread 0x7ffff270b700 (LWP 32437)]
[Thread 0x7ffff270b700 (LWP 32437) exited]
[New Thread 0x7ffff270b700 (LWP 32450)]

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/attachments/20160421/8e165f49/attachment-0002.html>


More information about the pjsip mailing list