High CPU usage issue / leap second?
Added by Jan Niggemann (redmine.org team member) over 12 years ago
Since about 3AM CEST this morning I'm seeing strange CPU usage patterns (see the attached graphs):
Using 'top' I saw that they were caused by ruby1.8...
pstree -p
init(1)─┬─acpid(923)
        ├─apache2(975)─┬─ApplicationPool(1106)─┬─ruby1.8(1107)─┬─ruby1.8(1441)
        │              │                       │               └─{ruby1.8}(1442)
        │              │                       ├─{ApplicationPoo}(1108)
        │              │                       ├─{ApplicationPoo}(1109)
        │              │                       └─{ApplicationPoo}(1440)
        │              ├─apache2(1095)
strace -f -p 1107
1442 clock_gettime(CLOCK_REALTIME, {1341229120, 830142021}) = 0
1442 futex(0xb76d1370, FUTEX_WAKE_PRIVATE, 1) = 0
1442 clock_gettime(CLOCK_REALTIME, {1341229120, 830184171}) = 0
1442 futex(0xb76d1344, FUTEX_WAIT_PRIVATE, 640959, {0, 9957850}) = -1 ETIMEDOUT (Connection timed out)
1442 clock_gettime(CLOCK_REALTIME, {1341229120, 840326708}) = 0
1442 futex(0xb76d1370, FUTEX_WAKE_PRIVATE, 1) = 0
1442 clock_gettime(CLOCK_REALTIME, {1341229120, 840365869}) = 0
1442 futex(0xb76d1344, FUTEX_WAIT_PRIVATE, 640961, {0, 9960839}) = -1 ETIMEDOUT (Connection timed out)
1442 clock_gettime(CLOCK_REALTIME, {1341229120, 850523168}) = 0
1442 futex(0xb76d1370, FUTEX_WAKE_PRIVATE, 1) = 0
1442 clock_gettime(CLOCK_REALTIME, {1341229120, 850562991}) = 0
1442 futex(0xb76d1344, FUTEX_WAIT_PRIVATE, 640963, {0, 9960177}) = -1 ETIMEDOUT (Connection timed out)
Rebooting mitigated the problem, the (virtual) machine is calm now and running like last week (and the time before). But I'm wondering if this has something to do with the 2012 leap second...
Any ideas?
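To relate the strace timestamps to the leap second: the first field returned by clock_gettime(CLOCK_REALTIME) is seconds since the Unix epoch, and the 2012 leap second was inserted at the end of 30 June UTC. The following is only a small sketch (Python is used purely for illustration, and since_leap_second is a made-up helper name) that converts the timestamp from the trace above and shows how long after the leap second it was taken:

```python
from datetime import datetime, timezone

# First second after the inserted leap second (2012-06-30 23:59:60 UTC).
AFTER_LEAP = datetime(2012, 7, 1, tzinfo=timezone.utc)


def since_leap_second(epoch_seconds):
    """Return (UTC datetime, time elapsed since the leap second)."""
    when = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return when, when - AFTER_LEAP


# Seconds field copied from the strace output above.
when, elapsed = since_leap_second(1341229120)
print(when.isoformat())  # 2012-07-02T11:38:40+00:00
print(elapsed)           # 1 day, 11:38:40
```

So the trace was captured roughly a day and a half after the leap second. That is compatible with the leap second being the trigger, but it does not prove it.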
For reference, I'm on Debian squeeze:
uname -a: Linux MyMachine 2.6.32-5-686 #1 SMP Sun May 6 04:01:19 UTC 2012 i686 GNU/Linux
cat /etc/debian_version: 6.0.5
ruby -v: ruby 1.8.7 (2010-08-16 patchlevel 302) [i486-linux]
rails -v: Rails 2.3.14
apache2 -v: Server version: Apache/2.2.16 (Debian)
passenger
redmine 1.4.3
php -v:
PHP 5.3.3-7+squeeze13 with Suhosin-Patch (cli) (built: Jun 10 2012 09:35:18)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
    with Suhosin v0.9.32.1, Copyright (c) 2007-2010, by SektionEins GmbH
RAILS_ENV="production" script/about
About your application's environment
Ruby version              1.8.7 (i486-linux)
RubyGems version          1.3.7
Rack version              1.1.3
Rails version             2.3.14
Active Record version     2.3.14
Active Resource version   2.3.14
Action Mailer version     2.3.14
Active Support version    2.3.14
Application root          /opt/redmine
Environment               production
Database adapter          mysql
Database schema version   20120301153455
week.png (19.5 KB)
detail.png (23.7 KB)
day.png (27.1 KB)
Replies (7)
RE: High CPU usage issue / leap second? - Added by Moritz Schepp over 12 years ago
Yes, I discovered we had the same problem on every server running Passenger. Restarting Apache, recompiling Passenger and updating Passenger all didn't help. I restarted one of the machines and it's all back to normal again.
What also seems to fix the issue is to just set the date:
/etc/init.d/chrony stop
date -s "`date`"
/etc/init.d/chrony start
You don't even have to restart Apache afterwards, so I guess the leap second could be the cause after all. Also, we only have KVM virtual machines here, so perhaps the problem is additionally specific to this scenario.
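The order of the three commands presumably matters: the NTP daemon is stopped first so that it does not adjust the clock while the date is being set, and it is started again afterwards. A minimal sketch that wraps the same commands in a script which only prints them by default (Python for illustration; apply_workaround and its parameters are hypothetical names, and the chrony init script path is the one from this post, so other NTP daemons will need a different path):

```python
import subprocess

# The commands from the post above, in the order given.
WORKAROUND = [
    "/etc/init.d/chrony stop",
    'date -s "`date`"',
    "/etc/init.d/chrony start",
]


def apply_workaround(run=None, dry_run=True):
    """Print the workaround commands, or run them in order if dry_run is False.

    `run` is an optional callable taking one command string; when it is not
    given, the command is executed through the shell and any failure raises.
    """
    handled = []
    for cmd in WORKAROUND:
        if dry_run:
            print("would run:", cmd)
        elif run is not None:
            run(cmd)
        else:
            subprocess.run(cmd, shell=True, check=True)
        handled.append(cmd)
    return handled
```

Calling apply_workaround() only lists the commands. apply_workaround(dry_run=False) actually runs them, which requires root privileges because the system clock is being set.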
RE: High CPU usage issue / leap second? - Added by Sergey Belov over 12 years ago
I experienced the same 100% CPU usage issue, and this helped:
date -s "`date`"
It looks like an issue in the Linux kernel; in my case the problem showed up with MySQL...
RE: High CPU usage issue / leap second? - Added by Romain F over 12 years ago
I can confirm the same excessive CPU usage. Setting the date solved the issue.
Other services, such as Mozilla and some social networks, also ran into bugs. Most of them are running Java.
RE: High CPU usage issue / leap second? - Added by Jan Niggemann (redmine.org team member) over 12 years ago
Now... what was causing this? Was it the kernel or ruby or something else?
Even after rebooting and setting the date manually, I still get this when running strace -f on ruby1.8:
[pid 7021] clock_gettime(CLOCK_REALTIME, {1341305908, 231893109}) = 0
[pid 7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714723, {0, 9945666}) = -1 ETIMEDOUT (Connection timed out)
[pid 7021] clock_gettime(CLOCK_REALTIME, {1341305908, 242006205}) = 0
[pid 7021] futex(0xb778e370, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 7021] clock_gettime(CLOCK_REALTIME, {1341305908, 242038638}) = 0
[pid 7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714725, {0, 9967567}) = -1 ETIMEDOUT (Connection timed out)
[pid 7021] clock_gettime(CLOCK_REALTIME, {1341305908, 252168164}) = 0
[pid 7021] futex(0xb778e370, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 7021] clock_gettime(CLOCK_REALTIME, {1341305908, 252205028}) = 0
[pid 7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714727, {0, 9963136}) = -1 ETIMEDOUT (Connection timed out)
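One way to read this trace: ETIMEDOUT is simply what a timed futex wait returns when its timeout expires, and if the thread were spinning, each wait would return much sooner than the timeout it asked for. The sketch below (Python for illustration only; the regular expressions and the function name wait_durations are assumptions written against the strace format shown above) compares the requested timeout of each FUTEX_WAIT_PRIVATE call with the time between the clock_gettime calls before and after it.

```python
import re

# The strace lines from above, one syscall per line.
TRACE = """\
[pid 7021] clock_gettime(CLOCK_REALTIME, {1341305908, 231893109}) = 0
[pid 7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714723, {0, 9945666}) = -1 ETIMEDOUT (Connection timed out)
[pid 7021] clock_gettime(CLOCK_REALTIME, {1341305908, 242006205}) = 0
[pid 7021] futex(0xb778e370, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 7021] clock_gettime(CLOCK_REALTIME, {1341305908, 242038638}) = 0
[pid 7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714725, {0, 9967567}) = -1 ETIMEDOUT (Connection timed out)
[pid 7021] clock_gettime(CLOCK_REALTIME, {1341305908, 252168164}) = 0
[pid 7021] futex(0xb778e370, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 7021] clock_gettime(CLOCK_REALTIME, {1341305908, 252205028}) = 0
[pid 7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714727, {0, 9963136}) = -1 ETIMEDOUT (Connection timed out)
"""

CLOCK_RE = re.compile(r"clock_gettime\(CLOCK_REALTIME, \{(\d+), (\d+)\}\)")
WAIT_RE = re.compile(r"FUTEX_WAIT_PRIVATE, \d+, \{(\d+), (\d+)\}\)")


def to_ns(seconds, nanoseconds):
    return int(seconds) * 10**9 + int(nanoseconds)


def wait_durations(trace):
    """Return a list of (requested_ms, observed_ms), one per timed futex wait
    that is followed by a clock_gettime call in the trace."""
    results = []
    last_clock = None  # timestamp of the most recent clock_gettime, in ns
    pending = None     # (requested_ns, clock_before_ns) of an unfinished wait
    for line in trace.splitlines():
        clock = CLOCK_RE.search(line)
        if clock:
            now = to_ns(*clock.groups())
            if pending is not None:
                requested, before = pending
                results.append((requested / 1e6, (now - before) / 1e6))
                pending = None
            last_clock = now
            continue
        wait = WAIT_RE.search(line)
        if wait and last_clock is not None:
            pending = (to_ns(*wait.groups()), last_clock)
    return results


for requested_ms, observed_ms in wait_durations(TRACE):
    print(f"requested {requested_ms:.3f} ms, observed {observed_ms:.3f} ms")
```

For the lines above, every wait asks for just under 10 ms and takes just over 10 ms (the observed value includes strace overhead), i.e. roughly 100 wakeups per second. That looks like a periodic timer tick rather than a busy loop, so the ETIMEDOUT results are probably harmless by themselves. It says nothing about which process or thread was actually burning CPU during the incident.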
RE: High CPU usage issue / leap second? - Added by Pramod kumbhar over 12 years ago
I experienced the same issue, and setting the date fixed it. For more information:
https://lkml.org/lkml/2012/7/1/176
https://lkml.org/lkml/2012/7/1/203
http://artipc10.vub.ac.be/wordpress/2012/07/01/leap-second-causing-ksoftirqd-and-java-to-use-lots-of-cpu-time/
RE: High CPU usage issue / leap second? - Added by Gurvan Le Dromaguet over 12 years ago
I get the same futex timeouts even after applying the workaround, but no more 100% CPU usage problems... I am lost.
So I have no functional problem leaving things like this. Is it safe? Do I have another issue that could be leading to the same messages?
RE: High CPU usage issue / leap second? - Added by Jan Niggemann (redmine.org team member) over 12 years ago
Same thing here: even after the problem is gone, I still have those futex ETIMEDOUT messages, too. But I'm not sure if they are a problem or not...
I'll point to this thread in the dev board, maybe they have an idea.