8Likes
 |
 |
Super Moderator
|
Join Date: Feb 2009
Location: In a galaxy far far away
Posts: 3,512
|
|

11-27-2012, 04:22 PM
[9.0044-33] Spikes of High CPU and memory
Hi All
I am using v9.0044-33 with UTM 120 rev 5. I am also using disk as part of the htp_local_db
It seems that the box is spiking on cpu and the memory is high .I have attached the relevant pic
As part of the troubleshooting I used perf top
Results
Code:
PerfTop: 1403 irqs/sec kernel:22.1% exact: 0.0% [1000Hz cycles], (all, 2 CPUs)
-------------------------------------------------------------------------------------------------------------------------------------
58.13% pngquant [.] 0x558a
4.60% libz.so.1.2.3 [.] longest_match
2.41% libpng12.so.0.31.0 [.] 0xf93c
2.21% [kernel] [k] read_hpet
1.40% [unknown] [.] 0x8050ed3
1.07% libz.so.1.2.3 [.] deflate_slow
0.97% libc-2.11.3.so [.] msort_with_tmp
0.93% libcairo.so.2.10800.8 [.] unpremultiply_data
0.68% perf [.] 0x1e000
0.59% libc-2.11.3.so [.] memcpy
0.58% libexpat.so.1.5.2 [.] 0xf9e6
0.56% libz.so.1.2.3 [.] fill_window
0.48% libfontconfig.so.1.3.0 [.] .L110
0.47% ld-2.11.3.so [.] do_lookup_x
0.39% libc-2.11.3.so [.] _int_malloc
0.36% ntpd [.] 0x1d013
0.36% libfreetype.so.6.3.18 [.] TT_RunIns
0.33% libfreetype.so.6.3.18 [.] 0x17a25
0.33% libfontconfig.so.1.3.0 [.] FcCompareFamily
0.32% libasapsdk.so [.] 0x20c35
0.29% libc-2.11.3.so [.] strstr
0.27% libz.so.1.2.3 [.] adler32
0.24% libperl.so [.] 0x101f1f
0.22% libc-2.11.3.so [.] __i686.get_pc_thunk.bx
0.21% postgres [.] 0x2b7d4c
0.20% libpixman-1.so.0.16.0 [.] pixman_rasterize_edges
0.20% libc-2.11.3.so [.] malloc
0.20% libc-2.11.3.so [.] strcmp
0.19% [kernel] [k] __kmalloc
0.19% libc-2.11.3.so [.] __memset_ia32
0.18% libz.so.1.2.3 [.] inflate_fast
0.18% [kernel] [k] unmap_vmas
0.17% [vdso] [.] 0x55576431
0.16% [kernel] [k] __schedule
0.16% [kernel] [k] page_fault
0.16% [kernel] [k] ia32_sysenter_target
0.16% libpthread.so.0 [.] 0xefbf
0.15% [kernel] [k] timerqueue_add
0.14% libc-2.11.3.so [.] _int_free
0.14% libpixman-1.so.0.16.0 [.] pixman_edge_init
0.14% [kernel] [k] _raw_spin_lock
0.14% [kernel] [k] hpet_legacy_next_event
0.14% libc-2.11.3.so [.] free
what is that process and why it's inroducing spikes to the box?
Code:
/root # top -b -n 1 | head -n 12 | tail -n 5
13298 root 39 19 3640 2080 632 R 90 0.1 0:00.73 pngquant
3729 root 20 0 11528 4788 1772 S 12 0.2 91:02.42 selfmonng.plx
4162 root 20 0 10140 6212 1668 S 9 0.3 23:28.03 dns-resolver.pl
13299 root 20 0 2808 1212 804 R 9 0.1 0:00.07 top
4686 root 20 0 32940 11m 3028 S 4 0.6 163:57.32 ctasd
Memory is normal at the moment:
Code:
/root # ps auxf | sort -nr -k 4 | head -5
810 28489 1.0 18.9 1205400 387520 ? Ssl 12:18 3:07 /var/chroot-http/usr/bin/httpproxy -f -c /var/chroot-http -u httpproxy
root 4324 1.8 12.3 339384 253364 ? Ssl 16:27 0:58 /usr/bin/cssd -d
snort 5520 0.3 5.5 385324 113612 ? S<sl 16:31 0:09 /sbin/snort_inline -Q -D -c /etc/snort/snort.conf -R 1 -K none -P 65535 -g 800 -u 800 --daq nfqmnlmmap --daq-var queue=0 --daq-var queue_len=1024 --daq-var verdictcache_len=25 --daq-var verdictcache_timeout=1000 --daq-var proto=ip*
wwwrun 6942 1.6 3.0 70492 63144 ? S 16:38 0:41 | \_ /var/webadmin/webadmin.plx
root 3691 3.7 3.0 77868 61700 ? S Nov24 180:19 /var/mdw/mdw.plx
Most of the swap is allocated to the HTTP proxy (Run top then press 'O' (capital letter o) followed by 'p' then 'enter'. Now processes should be sorted by their swap usage.)
Code:
top - 17:22:28 up 3 days, 8:16, 1 user, load average: 0.67, 1.22, 1.30
Tasks: 344 total, 1 running, 341 sleeping, 0 stopped, 2 zombie
Cpu(s): 2.4%us, 2.7%sy, 0.0%ni, 94.6%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2048484k total, 1666536k used, 381948k free, 3732k buffers
Swap: 1048572k total, 401552k used, 647020k free, 606832k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ SWAP COMMAND
28489 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:36.18 799m httpproxy
28490 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:07.72 799m httpproxy
28491 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:00.74 799m httpproxy
28505 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:00.14 799m httpproxy
28508 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:00.75 799m httpproxy
28509 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:00.83 799m httpproxy
28510 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:00.50 799m httpproxy
<output omitted>
28511 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:00.47 799m httpproxy
28512 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:00.43 799m httpproxy
28513 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:00.43 799m httpproxy
28514 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:00.41 799m httpproxy
28515 httpprox 20 0 1177m 377m 1312 S 0 18.9 0:00.45 799m httpproxy
There are too many child-processes for httpproxy is this normal?
__________________
Astaro ASG 120 Rev. 4 |Version 9.0* |1 X Astaro wireless AP10
Sophos UTM 120 Rev. 5 |Version 9.1*** | 2 X Astaro wireless AP10
GPG Key Fingerprint: B205 ED94 FBF0 E617 63A5 C6C9 D15D 70A1 B84E 9AC9
Last edited by wingman; 11-27-2012 at 04:26 PM.
|
|
Moderator
|
Join Date: Jul 2001
Location: southern California
Posts: 8,528
|
|

11-27-2012, 05:43 PM
Hi,
pngquant is probably used to make the reporting graphs.
Re high memory, etc., you might also want to see
Extremely Slow Internet
Barry
__________________
http://JobOyster.com
http://DealBert.net
End-user since v1.x - ASL 8.3x, HP DL145, 1GB RAM, 6 gigE NICs, 50-IP Platinum License
- ASL 7.5x, HP DL360G5, Xeon 5160, 3GB, RAID, gigE NICs, 50-IP Platinum License
- ASL 9.0x, 17-watt fanless mini-ITX system: MSI IM-945GSE-A Atom n270, 2GB RAM, Morex T3310 case. 2 Intel GigE, 3 VLANs. 80G 5200rpm 2.5" HD.
Netgear GS108T gigE VLAN switch & Astaro AP30 Access Point with VLANs.
100-IP Home Power User License. 15/5mbit FiOS internet.
|
 |
Super Moderator
|
Join Date: Feb 2009
Location: In a galaxy far far away
Posts: 3,512
|
|

11-27-2012, 07:49 PM
I had local_http_db set to disk during the first post.I then set it to none and then set it back to disk.
Also I have changed the value of webadmin settings>advanced> WebAdmin idle timeout " from 6000 to 240 and checked "Log out on dashboard" option as it was suggested that there might be a memory leak
The attached picture shows significant drop on the swap/memory .Not sure if this is related but I thought it might help
__________________
Astaro ASG 120 Rev. 4 |Version 9.0* |1 X Astaro wireless AP10
Sophos UTM 120 Rev. 5 |Version 9.1*** | 2 X Astaro wireless AP10
GPG Key Fingerprint: B205 ED94 FBF0 E617 63A5 C6C9 D15D 70A1 B84E 9AC9
Last edited by wingman; 11-27-2012 at 08:01 PM.
|
 |
Grandis Professorem Astaro
|
Join Date: Mar 2007
Location: Oklahoma City
Posts: 14,299
|
|

11-27-2012, 08:01 PM
That's interesting, Wingman. Had you tried setting local_http_db to ram? I wonder if it's an "undocumented feature" that it must be set to none before you can set it to disk?
Cheers - Bob
__________________
ACE V7 - Astaro Preferred Partner since V3
Sophos Gold Solution Partner
www.MediaSoftUSA.com
Addicted to my iPhone!
|
 |
Super Moderator
|
Join Date: Feb 2009
Location: In a galaxy far far away
Posts: 3,512
|
|

11-27-2012, 08:04 PM
Quote:
Originally Posted by BAlfson
That's interesting, Wingman. Had you tried setting local_http_db to ram? I wonder if it's an "undocumented feature" that it must be set to none before you can set it to disk?
Cheers - Bob
|
not sure Bob. I still have 2G Ram and it's suggested to use disk for local db instead of Mem. Unfortunately , it's more a trial and error case so setting it to none and then disk seemed to do the trick for me. I will wait until the system settles down and post back with new finding later this week
__________________
Astaro ASG 120 Rev. 4 |Version 9.0* |1 X Astaro wireless AP10
Sophos UTM 120 Rev. 5 |Version 9.1*** | 2 X Astaro wireless AP10
GPG Key Fingerprint: B205 ED94 FBF0 E617 63A5 C6C9 D15D 70A1 B84E 9AC9
|
 |
Super Moderator
|
Join Date: Feb 2009
Location: In a galaxy far far away
Posts: 3,512
|
|

11-27-2012, 08:23 PM
Code:
****:/root # ps ax -o pcpu,cmd|sort -n -r|head
300 ps ax -o pcpu,cmd
102 /usr/local/bin/pngquant 128 /var/log/reporting/images.tmp/ctnl_weekly.png
9.0 [confd.plx] <defunct>
3.9 /usr/local/bin/create_rrd_graphs.plx
3.7 /var/mdw/mdw.plx
3.5 confd [worker:prpc:webadmin]
3.4 /usr/lib/ctasd/ctasd -p /var/run/ctasd.pid -l /usr/lib/ctasd
3.4 ./ctipd.bin -l /usr/lib/ctipd
3.3 confd [worker:prpc:webadmin]
2.7 /var/webadmin/webadmin.plx
Code:
/root # ps ax -o rss,cmd |sort -n -r|head
360100 /var/chroot-http/usr/bin/httpproxy -f -c /var/chroot-http -u httpproxy
282880 /usr/bin/cssd -d
121092 /sbin/snort_inline -Q -D -c /etc/snort/snort.conf -R 1 -K none -P 65535 -g 800 -u 800 --daq nfqmnlmmap --daq-var queue=0 --daq-var queue_len=1024 --daq-var verdictcache_len=25 --daq-var verdictcache_timeout=1000 --daq-var proto=ip*
60136 /var/webadmin/webadmin.plx
59232 /var/webadmin/webadmin.plx
57940 /var/mdw/mdw.plx
54776 /var/webadmin/webadmin.plx
27148 confd [worker:prpc:webadmin]
22440 confd [worker:prpc:webadmin]
21064 postgres: writer process
checking the confd-debug I can see the following (not sure what it is actually)
Code:
2012:11:27-16:26:58 ******** confd[3084]: D main::wakeup_subscribers:1130() => id="3100" severity="debug" sys="System" sub="confd" name="sending USR1 signal" client="confd [worker:prpc]" pid="6122"
2012:11:27-16:26:58 ******** confd[3084]: D main::wakeup_subscribers:1130() => id="3100" severity="debug" sys="System" sub="confd" name="sending USR1 signal" client="confd [worker:prpc]" pid="4979"
2012:11:27-16:26:58 ******** confd[3084]: D main::wakeup_subscribers:1130() => id="3100" severity="debug" sys="System" sub="confd" name="sending USR1 signal" client="smtpd [master]" pid="5070"
2012:11:27-16:26:58 ******** confd[3084]: D main::wakeup_subscribers:1130() => id="3100" severity="debug" sys="System" sub="confd" name="sending USR1 signal" client="screenmgr" pid="5297"
2012:11:27-16:26:58 ******** confd[3084]: D main::wakeup_subscribers:1130() => id="3100" severity="debug" sys="System" sub="confd" name="sending USR1 signal" client="dns-resolver.plx" pid="4162"
2012:11:27-16:26:58 ******** confd[3084]: D main::wakeup_subscribers:1130() => id="3100" severity="debug" sys="System" sub="confd" name="sending USR1 signal" client="websec-reporter.pl" pid="3323"
2012:11:27-16:26:58 ******** confd[3084]: D main::wakeup_subscribers:1130() => id="3100" severity="debug" sys="System" sub="confd" name="sending USR1 signal" client="mdw.plx" pid="3691"
2012:11:27-16:26:58 ******** confd[3084]: D main::wakeup_subscribers:1130() => id="3100" severity="debug" sys="System" sub="confd" name="sending USR1 signal" client="notifier.plx" pid="3476"
__________________
Astaro ASG 120 Rev. 4 |Version 9.0* |1 X Astaro wireless AP10
Sophos UTM 120 Rev. 5 |Version 9.1*** | 2 X Astaro wireless AP10
GPG Key Fingerprint: B205 ED94 FBF0 E617 63A5 C6C9 D15D 70A1 B84E 9AC9
|
|
Wizard
|
Join Date: May 2003
Location: Brunswick, Maryland, USA
Posts: 5,234
|
|

11-27-2012, 08:46 PM
the http proxy set to disk doesn't save that much because the database index is still stored in ram. The only two options that have a difference in ram usage is mem or none.
__________________
Owner: Emmanuel Technology Consulting
http://www.etc-md.com
Sophos Silver Solution Reseller
Sophos Certified Architect - UTM
Gillware Data Recovery Afilliate
MIcrosoft Registered Partner
|
|
Junior Member
|
Join Date: May 2012
Location: Columbus, Ohio, USA
Posts: 19
|
|

11-28-2012, 03:32 AM
Rebuilding the reporting (?) database fixed my CPU spikes that corresponded with reporting and raised weekly baseline for CPU on both a 8.3something and 9 system.
Code:
/etc/init.d/postgresql rebuild
|
 |
Super Moderator
|
Join Date: Feb 2009
Location: In a galaxy far far away
Posts: 3,512
|
|

11-28-2012, 06:42 AM
Quote:
|
Originally Posted by hopkins.103
Rebuilding the reporting (?) database fixed my CPU spikes that corresponded with reporting and raised weekly baseline for CPU on both a 8.3something and 9 system.
/etc/init.d/postgresql rebuild
|
I am not getting any database errors either in confd or confd-debug at the moment so not sure if rebuilding the db will fix anything
__________________
Astaro ASG 120 Rev. 4 |Version 9.0* |1 X Astaro wireless AP10
Sophos UTM 120 Rev. 5 |Version 9.1*** | 2 X Astaro wireless AP10
GPG Key Fingerprint: B205 ED94 FBF0 E617 63A5 C6C9 D15D 70A1 B84E 9AC9
Last edited by wingman; 11-28-2012 at 09:18 AM.
|
 |
Super Moderator
|
Join Date: Feb 2009
Location: In a galaxy far far away
Posts: 3,512
|
|

11-28-2012, 12:13 PM
Strangely enough, checking the logs today via winSCP I can see that everything is set to 0 as if it's not logging anymore.I get the same result from webadmin>logging >everything is set to 0
Latest entry on the confd log was 2012:11:27-16:28:03
Confd-debug has messages like
Code:
2012:11:27-16:25:47 ***** confd[3084]: D main::wakeup_subscribers:1130() => id="3100" severity="debug" sys="System" sub="confd" name="sending USR1 signal" client="screenmgr" pid="5297"
2012:11:27-16:25:47 ***** confd[3084]: D main::wakeup_subscribers:1130() => id="3100" severity="debug" sys="System" sub="confd" name="sending USR1 signal" client="dns-resolver.plx" pid="4162"
is this normal?
Also System log
Code:
2012:11:27-16:25:43 ***** syslog-ng[2671]: POLLERR occurred while idle; fd='139'
2012:11:27-16:25:44 ***** dns-resolver[4162]: Updating REF_heqygyURNf :: streamcloud.mediacdn.com
2012:11:27-16:25:47 ***** dns-resolver[4162]: No such IPv4 domain: imap-vdo.mx.aol.com (REF_SabkXLkTJK)
2012:11:27-16:25:47 ***** dns-resolver[4162]: No such IPv6 domain: imap-vdo.mx.aol.com (REF_SabkXLkTJK)
2012:11:27-16:25:47 ***** dns-resolver[4162]: Updating REF_SabkXLkTJK :: imap-vdo.mx.aol.com
2012:11:27-16:26:01 ***** /usr/sbin/cron[3385]: (root) CMD (/sbin/audld.plx --nosys --trigger)
2012:11:27-16:26:57 ***** dns-resolver[4162]: Updating REF_EwWPLqlCKU :: smtp.live.com
2012:11:27-16:28:02 ***** dns-resolver[4162]: Updating REF_heqygyURNf :: streamcloud.mediacdn.com
2012:11:27-16:28:04 ***** dns-resolver[4162]: No change to REF_EwWPLqlCKU :: smtp.live.com
2012:11:27-16:28:17 ***** syslog-ng[2671]: Termination requested via signal, terminating;
2012:11:27-16:28:17 ***** syslog-ng[2671]: syslog-ng shutting down; version='3.0.10'
Nothing in fallback.log
Code:
loginuser@*****:/var/mdw/scripts > ps -ef | grep postgres
postgres 3604 1 0 Nov24 ? 00:00:42 /usr/pgsql84/bin/postgres -D /var/storage/pgsql/data
postgres 3627 3604 0 Nov24 ? 00:02:18 postgres: writer process
postgres 3628 3604 0 Nov24 ? 00:03:02 postgres: wal writer process
postgres 3629 3604 0 Nov24 ? 00:01:40 postgres: autovacuum launcher process
postgres 3630 3604 0 Nov24 ? 00:03:34 postgres: stats collector process
postgres 4332 3604 0 Nov24 ? 00:00:59 postgres: postgres hotspot 127.0.0.1(43253) idle
postgres 4508 3604 0 Nov24 ? 00:34:07 postgres: reporting reporting [local] idle
postgres 4552 3604 0 Nov24 ? 00:00:44 postgres: postgres pop3 127.0.0.1(43269) idle
postgres 5309 3604 0 Nov24 ? 00:00:00 postgres: postgres smtp 127.0.0.1(43285) idle
postgres 13780 3604 0 07:13 ? 00:00:02 postgres: epp epp 127.0.0.1(34438) idle
postgres 14854 3604 0 14:02 ? 00:00:00 postgres: postgres pop3 127.0.0.1(37309) idle
100 14896 17906 0 14:03 pts/1 00:00:00 grep postgres
postgres 17133 3604 0 Nov24 ? 00:00:00 postgres: postgres pop3 127.0.0.1(45728) idle
postgres 18497 3604 0 Nov25 ? 00:00:00 postgres: postgres hotspot 127.0.0.1(35165) idle
postgres 20879 3604 0 Nov24 ? 00:00:00 postgres: postgres smtp 127.0.0.1(45960) idle
postgres 20882 3604 0 Nov24 ? 00:00:00 postgres: postgres pop3 127.0.0.1(45961) idle
postgres 29745 3604 0 09:00 ? 00:01:15 postgres: postgres smtp 127.0.0.1(35131) idle
postgres 32537 3604 0 Nov24 ? 00:00:00 postgres: postgres hotspot 127.0.0.1(44758) idle
postgres 32681 3604 0 Nov24 ? 00:00:00 postgres: postgres hotspot 127.0.0.1(44778) idle
Any ideas ?
__________________
Astaro ASG 120 Rev. 4 |Version 9.0* |1 X Astaro wireless AP10
Sophos UTM 120 Rev. 5 |Version 9.1*** | 2 X Astaro wireless AP10
GPG Key Fingerprint: B205 ED94 FBF0 E617 63A5 C6C9 D15D 70A1 B84E 9AC9
Last edited by wingman; 11-28-2012 at 01:04 PM.
|
| Thread Tools |
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
All times are GMT. The time now is 01:32 AM.
|