[Fusionforge-general] Testsuite timeouts

Sylvain Beucler - Inria sylvain.beucler at inria.fr
Wed Nov 4 19:16:37 CET 2015


Hi,

On 03/11/2015 19:09, Sylvain Beucler - Inria wrote:
> After debugging stalled processes from our testsuite and prod, I 
> highly suspect that the timeouts come from nss/nscd (see attached 
> backtrace w/ debugging symbols):
>
> - GDB shows they are stuck in a libnss-pgsql2 deadlock, as described in:
> http://lists.fusionforge.org/pipermail/fusionforge-general/2014-March/002631.html 
>
> However since nscd is running, the process shouldn't even enter 
> libnss-pgsql, so timeouts happen during a random nscd failure.
>
> - GDB shows libpq checks the requestor UID *to locate the .pgpass 
> file* (not to authenticate the username, since our nss-pgsql.conf 
> specifies it explicitly). Fortunately this can be bypassed like:
> # service unscd stop
> # su admin -c id
> <stalls...>
> # PGPASSFILE= su admin -c id
> uid=20102(admin) gid=100(users) 
> groupes=100(users),10006(tmpl),10007(projecta),1.
>
>
> So short of debugging unscd, and short of modifying libpq so it stops 
> using getpw* when used from nss, we can set PGPASSFILE in various 
> daemons (apache scm config at least, possibly ssh/shell too).
>
> What do you think?

Updates:

- If the server is remote, getpw* is also called to look for SSL-related 
files in the user's home directory (???).
   Additional workaround: set sslmode=disable in nss-pgsql.conf

- To apply PGPASSFILE='' in apache, SetEnv/SetEnvIf have no effect.
   On Debian, using /etc/apache2/envvars works (i.e. no need for unscd).
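
A minimal sketch of the envvars workaround, assuming the stock Debian 
layout where apache2ctl sources /etc/apache2/envvars before starting the 
daemon. The exact placement in the file and the verification line are 
illustrative, not taken from our setup:

```shell
# Hypothetical addition at the end of /etc/apache2/envvars (Debian layout).
# Exporting an empty PGPASSFILE mirrors the `PGPASSFILE= su admin -c id`
# test: the variable is set, but it names no file.
export PGPASSFILE=

# Verification only (run by hand, do not leave this in envvars):
# a child process should see the variable as set but empty.
sh -c 'echo "PGPASSFILE is ${PGPASSFILE+set}, value=[$PGPASSFILE]"'
```

Since the environment is inherited when the daemon starts, Apache most 
likely needs a full restart rather than a reload for this to take effect.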

I'm considering setting PGPASSFILE in the testsuite's Apache instances to 
see if that helps.

Cheers!
Sylvain


