High CPU usage issue / leap second?

Added by Jan Niggemann (redmine.org team member) over 11 years ago

Since about 3AM CEST this morning I'm seeing strange CPU usage patterns:
(graphs attached: week, detail, day)

Using 'top' I saw that they were caused by ruby1.8...
pstree -p

init(1)─┬─acpid(923)
        ├─apache2(975)─┬─ApplicationPool(1106)─┬─ruby1.8(1107)─┬─ruby1.8(1441)
        │              │                       │               └─{ruby1.8}(1442)
        │              │                       ├─{ApplicationPoo}(1108)
        │              │                       ├─{ApplicationPoo}(1109)
        │              │                       └─{ApplicationPoo}(1440)
        │              ├─apache2(1095)

strace -f -p 1107

1442  clock_gettime(CLOCK_REALTIME, {1341229120, 830142021}) = 0
1442  futex(0xb76d1370, FUTEX_WAKE_PRIVATE, 1) = 0
1442  clock_gettime(CLOCK_REALTIME, {1341229120, 830184171}) = 0
1442  futex(0xb76d1344, FUTEX_WAIT_PRIVATE, 640959, {0, 9957850}) = -1 ETIMEDOUT (Connection timed out)
1442  clock_gettime(CLOCK_REALTIME, {1341229120, 840326708}) = 0
1442  futex(0xb76d1370, FUTEX_WAKE_PRIVATE, 1) = 0
1442  clock_gettime(CLOCK_REALTIME, {1341229120, 840365869}) = 0
1442  futex(0xb76d1344, FUTEX_WAIT_PRIVATE, 640961, {0, 9960839}) = -1 ETIMEDOUT (Connection timed out)
1442  clock_gettime(CLOCK_REALTIME, {1341229120, 850523168}) = 0
1442  futex(0xb76d1370, FUTEX_WAKE_PRIVATE, 1) = 0
1442  clock_gettime(CLOCK_REALTIME, {1341229120, 850562991}) = 0
1442  futex(0xb76d1344, FUTEX_WAIT_PRIVATE, 640963, {0, 9960177}) = -1 ETIMEDOUT (Connection timed out)

Rebooting mitigated the problem: the (virtual) machine is calm now and running as it did last week (and before that). But I'm wondering if this has something to do with the 2012 leap second...
Any ideas?
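
To put the trace in relation to the leap second, the seconds field of the clock_gettime lines can be converted to UTC. This is only a small sketch: the epoch value is copied from the strace output above, and the reference point is the first second after the leap second inserted at 2012-06-30 23:59:60 UTC.

```python
from datetime import datetime, timezone

# Seconds field of the first clock_gettime line in the strace output above.
TRACE_EPOCH = 1341229120

# First second after the leap second 2012-06-30 23:59:60 UTC.
AFTER_LEAP = datetime(2012, 7, 1, 0, 0, 0, tzinfo=timezone.utc)

# Interpret the epoch value as a timezone-aware UTC timestamp.
trace_time = datetime.fromtimestamp(TRACE_EPOCH, tz=timezone.utc)

# Time between the leap second and the moment the trace was captured.
hours_since_leap = (trace_time - AFTER_LEAP).total_seconds() / 3600

print(trace_time.isoformat())
print(round(hours_since_leap, 1))
```

This only shows when the trace was captured, not when the CPU load began; the graphs are the better source for that.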

For reference, I'm on Debian squeeze:
uname -a: Linux MyMachine 2.6.32-5-686 #1 SMP Sun May 6 04:01:19 UTC 2012 i686 GNU/Linux
cat /etc/debian_version: 6.0.5
ruby -v: ruby 1.8.7 (2010-08-16 patchlevel 302) [i486-linux]
rails -v: Rails 2.3.14
apache2 -v: Server version: Apache/2.2.16 (Debian)
passenger
redmine 1.4.3
php -v:

     PHP 5.3.3-7+squeeze13 with Suhosin-Patch (cli) (built: Jun 10 2012 09:35:18)
     Copyright (c) 1997-2009 The PHP Group
     Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
         with Suhosin v0.9.32.1, Copyright (c) 2007-2010, by SektionEins GmbH

RAILS_ENV="production" script/about

About your application's environment
Ruby version              1.8.7 (i486-linux)
RubyGems version          1.3.7
Rack version              1.1.3
Rails version             2.3.14
Active Record version     2.3.14
Active Resource version   2.3.14
Action Mailer version     2.3.14
Active Support version    2.3.14
Application root          /opt/redmine
Environment               production
Database adapter          mysql
Database schema version   20120301153455

week.png (19.5 KB)
detail.png (23.7 KB)
day.png (27.1 KB)

Replies (7)

RE: High CPU usage issue / leap second? - Added by Moritz Schepp over 11 years ago

Yes, I discovered we had the same problem on every server running passenger. Restarting Apache, recompiling passenger, and updating passenger all didn't help. I restarted one of the machines and it's all back to normal again.

What also seems to fix the issue is to just set the date:

/etc/init.d/chrony stop
date -s "`date`" 
/etc/init.d/chrony start

You don't even have to restart Apache afterwards. So I guess the leap second could be the cause after all. Also, we only have KVM virtual machines here, so perhaps the problem is additionally specific to this scenario.
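
For anyone who wants to script this, here is a rough Python equivalent of the `date -s` step: read CLOCK_REALTIME and write the same value straight back. It is a sketch, not a tested fix. The function name `reset_realtime_clock` and its two parameters are made up for illustration, and the idea that merely re-setting the clock is what clears the condition is an assumption based on the reports in this thread.

```python
import time


def reset_realtime_clock(get_time=time.clock_gettime, set_time=time.clock_settime):
    """Set CLOCK_REALTIME to the value it already has and return that value.

    With the default arguments this changes the real system clock, which
    requires root privileges. The parameters exist only so that the
    function can be exercised with stand-ins.
    """
    now = get_time(time.CLOCK_REALTIME)
    set_time(time.CLOCK_REALTIME, now)
    return now


if __name__ == "__main__":
    # Dry run with stand-ins: nothing is written to the system clock.
    recorded = []
    reset_realtime_clock(
        get_time=lambda clock_id: 1341229120.0,
        set_time=lambda clock_id, value: recorded.append(value),
    )
    print(recorded)
```

Stopping and restarting the time daemon around the call, as in the commands above, is not handled here.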

RE: High CPU usage issue / leap second? - Added by Sergey Belov over 11 years ago

Experienced the same 100% CPU usage issue, and this helped:

date -s "`date`" 

It looks like an issue in the Linux kernel; I had the problem with MySQL...

RE: High CPU usage issue / leap second? - Added by Romain F over 11 years ago

I can confirm the same excessive CPU usage. Setting the date solved the issue.

Other services, such as Mozilla and some social networks, encountered bugs as well. Most of them are running Java.

RE: High CPU usage issue / leap second? - Added by Jan Niggemann (redmine.org team member) over 11 years ago

Now... what was causing this? Was it the kernel or ruby or something else?
Even after rebooting and setting the date manually, I still get this when running strace -f on ruby1.8:

[pid  7021] clock_gettime(CLOCK_REALTIME, {1341305908, 231893109}) = 0
[pid  7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714723, {0, 9945666}) = -1 ETIMEDOUT (Connection timed out)
[pid  7021] clock_gettime(CLOCK_REALTIME, {1341305908, 242006205}) = 0
[pid  7021] futex(0xb778e370, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  7021] clock_gettime(CLOCK_REALTIME, {1341305908, 242038638}) = 0
[pid  7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714725, {0, 9967567}) = -1 ETIMEDOUT (Connection timed out)
[pid  7021] clock_gettime(CLOCK_REALTIME, {1341305908, 252168164}) = 0
[pid  7021] futex(0xb778e370, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  7021] clock_gettime(CLOCK_REALTIME, {1341305908, 252205028}) = 0
[pid  7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714727, {0, 9963136}) = -1 ETIMEDOUT (Connection timed out)
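
One way to read such a trace is to compare the timeout each futex wait asked for with the time that passed between the clock_gettime calls before and after it. The sketch below does that for a few lines copied from the output above; the regular expressions and the helper name `timed_waits` are my own. If the elapsed time roughly matches the requested timeout, that would suggest a thread sleeping for its full interval of about 10 ms rather than spinning. That reading is an assumption, and running under strace distorts the timing somewhat.

```python
import re

# Sample lines copied from the strace output above.
TRACE = """\
[pid  7021] clock_gettime(CLOCK_REALTIME, {1341305908, 231893109}) = 0
[pid  7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714723, {0, 9945666}) = -1 ETIMEDOUT (Connection timed out)
[pid  7021] clock_gettime(CLOCK_REALTIME, {1341305908, 242006205}) = 0
[pid  7021] futex(0xb778e370, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  7021] clock_gettime(CLOCK_REALTIME, {1341305908, 242038638}) = 0
[pid  7021] futex(0xb778e344, FUTEX_WAIT_PRIVATE, 12714725, {0, 9967567}) = -1 ETIMEDOUT (Connection timed out)
[pid  7021] clock_gettime(CLOCK_REALTIME, {1341305908, 252168164}) = 0
"""

CLOCK_RE = re.compile(r"clock_gettime\(CLOCK_REALTIME, \{(\d+), (\d+)\}\)")
WAIT_RE = re.compile(r"FUTEX_WAIT_PRIVATE, \d+, \{(\d+), (\d+)\}\)")

NS_PER_SEC = 1_000_000_000


def timed_waits(trace):
    """Return (requested_ns, elapsed_ns) for each timed futex wait.

    elapsed_ns is the difference between the clock_gettime readings
    immediately before and after the wait.
    """
    results = []
    last_clock = None  # most recent clock reading, in nanoseconds
    pending = None     # (requested_ns, clock reading before the wait)
    for line in trace.splitlines():
        clock = CLOCK_RE.search(line)
        wait = WAIT_RE.search(line)
        if clock:
            now = int(clock.group(1)) * NS_PER_SEC + int(clock.group(2))
            if pending is not None:
                requested, before = pending
                results.append((requested, now - before))
                pending = None
            last_clock = now
        elif wait and last_clock is not None:
            requested = int(wait.group(1)) * NS_PER_SEC + int(wait.group(2))
            pending = (requested, last_clock)
    return results


for requested, elapsed in timed_waits(TRACE):
    print(f"requested {requested / 1e6:.2f} ms, elapsed {elapsed / 1e6:.2f} ms")
```

In this excerpt both waits ran slightly longer than the timeout they requested.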

RE: High CPU usage issue / leap second? - Added by Gurvan Le Dromaguet over 11 years ago

I have the same futex timeouts even after applying the workaround, but no more 100% CPU usage problems... I am lost.
So I have no functional problem keeping things like this. Is it safe? Do I have another issue that could be leading to the same messages?

RE: High CPU usage issue / leap second? - Added by Jan Niggemann (redmine.org team member) over 11 years ago

Same thing here: even after the problem's gone, I still have those futex ETIMEDOUT messages, too. But I'm not sure if they are a problem or not...

I'll point to this thread in the dev board, maybe they have an idea.
