View Issue Details

ID: 0000623
Project: tcsh
Category: general
View Status: public
Last Update: 2017-07-03 08:58
Reporter: John GALLET
Assigned To:
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open
Platform: Intel
OS: Linux
OS Version:
Product Version: 6.20.00
Target Version:
Fixed in Version:
Summary: 0000623: Unexpected high CPU load after tcsh version 6.13
Description:
Hi,
In production we use a single file that sets many environment variables. Under csh on Solaris we never had any problems with it, and we recently started deploying on Linux with tcsh.
With tcsh 6.13 all is "normal". From version 6.14 up to the latest, 6.20, we see a very high CPU load just from sourcing the script: more than a full CPU is used by a one-line heartbeat script called at random times by twenty heartbeating processes. The attached xload screenshots show the comparison.
We mitigated this by removing the call from $HOME/.cshrc or by using #!/bin/csh -f where possible, but we still see a high overall CPU load in all scripting (which is important).

Using the "perf" kernel profiler, I think I have narrowed down the culprit to gconv().

If I recompile tcsh 6.14 or 6.20 after removing #ifdef HAVE_ICONV from config.h and #define WIDE_CHAR (and UTF_16 for 6.20) from config_f.h, then the CPU load stays at nearly zero, as expected, even in the "worst" conditions (sourcing the file from $HOME/.cshrc and not using -f).

Of course I cannot say whether it is a gconv() bug, a bad configuration on all 5 machines (which is perfectly possible), or the way gconv() is called.

I think that gconv() is part of glibc, so here are the versions I have depending on the machine. I did not check them all with tcsh-620 + iconv deactivated.

libc-2.12.so (on two different RH like VMs)
libc-2.7.so (on one RH derivative hardware server)
libc-2.19.so (debian VM)
libc-2.24.so (ubuntu VM)

Direct compilation of a tcsh 6.20 with iconv enabled and crazy cpu: perf report -g graph -i xyz.perf.data

+ 22.86% heartbeat.csh ISO8859-15.so [.] gconv
+ 12.86% heartbeat.csh libc-2.12.so [.] __GI_rtwcomb
+ 12.38% heartbeat.csh [kernel.kallsyms] [k] 0xffffffff811807f9
    8.81% heartbeat.csh libfreebl3.so [.] 0x000000000000a8bd
+ 4.52% heartbeat.csh tcsh [.] one_wctomb
+ 4.05% heartbeat.csh tcsh [.] short2str
    3.33% prelink [kernel.kallsyms] [k] 0xffffffff811a9926
+ 3.33% heartbeat.csh tcsh [.] btell
    2.86% id [kernel.kallsyms] [k] 0xffffffff811585c2
+ 2.14% heartbeat.csh libc-2.12.so [.] wctomb

With iconv disabled, and no cpu problems:

+ 19.06% heartbeat.csh tcsh-620-noiconv [.] short2str
+ 13.38% heartbeat.csh [kernel.kallsyms] [k] 0xffffffff81297385
+ 11.37% heartbeat.csh libc-2.12.so [.] _int_malloc
+ 10.70% heartbeat.csh libfreebl3.so [.] 0x0000000000013887
+ 5.35% heartbeat.csh libc-2.12.so [.] memcpy
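
As a rough cross-check of the two listings above, the overhead can be summed per shared object. This is only a sketch: sum_by_dso is a hypothetical helper (it is not part of the attached tarball), and it assumes the perf report text has the layout pasted above: an optional "+" marker, then the percentage, the command, and the shared object.

```shell
#!/bin/sh
# Sketch only: sum the perf overhead per shared object (DSO).
# sum_by_dso is a hypothetical helper name, not an existing tool.
# Assumed input layout: [+] PERCENT% COMMAND DSO [.] SYMBOL
sum_by_dso() {
    awk '
        { i = ($1 == "+") ? 2 : 1 }      # skip the optional "+" expand marker
        $i ~ /%$/ {                      # only lines that carry a percentage
            pct = $i
            sub(/%$/, "", pct)
            dso[$(i + 2)] += pct         # field after the command name is the DSO
        }
        END { for (d in dso) printf "%6.2f%% %s\n", dso[d], d }
    ' "$@" | sort -rn
}
```

With each listing saved to a text file (the file names here are placeholders), "sum_by_dso iconv-enabled.txt" and "sum_by_dso iconv-disabled.txt" give one line per shared object, largest first, so the share spent in ISO8859-15.so and libc can be compared directly between the two builds.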

The file /usr/lib/locale/locale-archive was around 100 MB on all machines; shrinking it to 3 MB by keeping only the en_ and fr_ locales made no difference.

I have been able to reproduce on:
1) three physical servers, all of them Red Hat derivatives (RH6.5, Fedora Core, CentOS)
2) four VMs, including an Ubuntu and a Debian

Any help appreciated.
Sincerely,
John
Steps To Reproduce:
I wrote a few shell scripts simulating the heartbeating daemons. The file HOW-TO-REPRODUCE-LOAD-PB.TXT should explain how to set up. Basically, untar the scripts in your $HOME, which will create an HL_TCSH/ directory, then check that the command:
source $HOME/HL_TCSH/logicals.csh
runs OK.
Add it to your $HOME/.cshrc, commented out.
Launch "poc_launch_daemons.csh 20" to simulate 20 heartbeating processes and wait for load to stabilize.
Then uncomment the "source $HOME/HL_TCSH/logicals.csh" in your .cshrc and watch load go up after some 30 seconds.

You might have to unset LS_COLORS to test with tcsh lower than 6.15.
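
The comment/uncomment step above can be scripted so that both runs use exactly the same startup file apart from that one line. This is a minimal sketch under assumptions: toggle_logicals is a hypothetical helper (not one of the attached scripts), and it assumes the line in .cshrc starts with "source" or "# source" and ends in HL_TCSH/logicals.csh.

```shell
#!/bin/sh
# Sketch only: comment or uncomment the "source .../HL_TCSH/logicals.csh"
# line in a csh startup file. toggle_logicals is a hypothetical helper.
toggle_logicals() {
    mode=$1                     # "on" = uncomment, "off" = comment out
    rc=${2:-$HOME/.cshrc}       # startup file to edit (defaults to ~/.cshrc)
    tmp=$rc.tmp.$$
    case $mode in
        on)  sed 's|^#[[:space:]]*\(source .*HL_TCSH/logicals\.csh\)|\1|' "$rc" > "$tmp" ;;
        off) sed 's|^\(source .*HL_TCSH/logicals\.csh\)|# \1|' "$rc" > "$tmp" ;;
        *)   echo "usage: toggle_logicals on|off [file]" >&2; return 2 ;;
    esac
    mv "$tmp" "$rc"             # replace the startup file with the edited copy
}
```

For example: run "toggle_logicals off", launch "poc_launch_daemons.csh 20" and wait for the load to stabilize, then run "toggle_logicals on" and watch the load.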
Additional Information:
1) The two load graphs compare the exact same script, "poc_launch_daemons.csh 20", changing ONLY the symbolic link /bin/tcsh from the tcsh-6.13 binary to tcsh-6.18 or tcsh-6.20.

2) All variables in my environment are listed below. Trying different combinations of LANG and/or LC_ALL made no difference.

USER=jgallet
LOGNAME=jgallet
HOME=/users/jgallet
PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/users/oracle/app/oracle/product/11.2.0/client_1//bin:/users/jgallet/SVN_CLI
MAIL=/var/spool/mail/jgallet
SHELL=/bin/csh
SSH_CLIENT=192.168.12.33 58840 22
SSH_CONNECTION=192.168.12.33 58840 192.168.4.83 22
SSH_TTY=/dev/pts/1
TERM=xterm
HOSTTYPE=x86_64-linux
VENDOR=unknown
OSTYPE=linux
MACHTYPE=x86_64
SHLVL=1
PWD=/users/jgallet
GROUP=tpm
HOST=RH66
REMOTEHOST=jgallet-pc.4tpm.grp
HOSTNAME=RH66
CVS_RSH=ssh
G_BROKEN_FILENAMES=1
LANG=en_US@euro
LESSOPEN=||/usr/bin/lesspipe.sh %s
MODULESHOME=/usr/share/Modules
MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles
LOADEDMODULES=
QTDIR=/usr/lib64/qt-3.3
QTINC=/usr/lib64/qt-3.3/include
QTLIB=/usr/lib64/qt-3.3/lib
ORACLE_HOME=/users/oracle/app/oracle/product/11.2.0/client_1/
LD_LIBRARY_PATH=/users/oracle/app/oracle/product/11.2.0/client_1//lib:
LC_ALL=en_US.iso885915
TWO_TASK=BGP
Tags: No tags attached.

Issue History

Date Modified Username Field Change
2017-07-03 08:58 John GALLET New Issue
2017-07-03 08:58 John GALLET File Added: high-load-tcsh.tar.gz
2017-07-03 08:58 John GALLET File Added: tcsh-load-6.13-vs-6.18-debian.png
2017-07-03 08:58 John GALLET File Added: tcsh-load-6.13-vs-6.20-ubuntu.png