b***@apache.org
2018-11-12 08:33:24 UTC
https://bz.apache.org/bugzilla/show_bug.cgi?id=62903
Bug ID: 62903
Summary: Better handling of a thread limit forced by cgroups
via systemd
Product: Apache httpd-2
Version: 2.4.29
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Core
Assignee: ***@httpd.apache.org
Reporter: ***@maxcluster.de
Target Milestone: ---
Hi everyone,
this is mainly a copy of my posting on the httpd-users mailing list[1] because
I think there is a bug somewhere deep down:
I stumbled upon some strange issues with apache2. First, some facts about the
environment where apache2 was running:
* Virtuozzo 7 with a stripped down version of Ubuntu 18.04 as container OS.
* The thread/processes limit is set to 1500 for the container.
* apache2 is using the `mpm_worker_module` with this configuration:
<IfModule mpm_worker_module>
ServerLimit 16
StartServers 3
MinSpareThreads 75
MaxSpareThreads 250
ThreadLimit 64
ThreadsPerChild 25
MaxRequestWorkers 400
MaxConnectionsPerChild 10000
</IfModule>
* I'm using apache2-2.4.29-1ubuntu4.4.
* For benchmarking I used the command `ab -kc 1000 -t 60 http://foo.invalid/`.
The returned page was a simple static HTML page with no FastCGI/PHP or any
other "fancy" stuff.
I was expecting some delay in the benchmarks because the number of
connections (`1000`) was larger than the upper thread limit of the
`mpm_worker_module`. This should not be a problem at all because apache2 shows
some good error messages if it runs into a process/thread limit.
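To make the "upper thread limit" concrete, here is a small sketch of the arithmetic, using only the directive values from the configuration above. The helper name `worker_capacity` is made up for this illustration and is not part of httpd.

```python
import math

# Values copied from the mpm_worker_module block in this report.
config = {
    "ServerLimit": 16,
    "ThreadsPerChild": 25,
    "MaxRequestWorkers": 400,
}


def worker_capacity(cfg):
    """Return (child_processes, worker_threads) implied by the directives."""
    # Children needed to reach MaxRequestWorkers, capped by ServerLimit.
    needed = math.ceil(cfg["MaxRequestWorkers"] / cfg["ThreadsPerChild"])
    children = min(cfg["ServerLimit"], needed)
    return children, children * cfg["ThreadsPerChild"]


children, workers = worker_capacity(config)
print(f"{children} child processes, {workers} worker threads")
print(f"ab concurrency 1000 exceeds the worker threads: {1000 > workers}")
```

With these values the configuration tops out at 16 children and 400 worker threads, so 1000 concurrent connections should only queue, not break anything.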
But in this case, this is what I found in the `error.log` (removed the
timestamp for readability):
… [mpm_worker:alert] [pid 2835:tid 140106655919040] (11)Resource temporarily
unavailable: AH00282: apr_thread_create: unable to create worker thread
… [mpm_worker:alert] [pid 2898:tid 140106655919040] (11)Resource temporarily
unavailable: AH00282: apr_thread_create: unable to create worker thread
… [mpm_worker:alert] [pid 2938:tid 140106655919040] (11)Resource temporarily
unavailable: AH00282: apr_thread_create: unable to create worker thread
Or
… [mpm_worker:crit] [pid 25242:tid 140005187725056] (22)Invalid argument:
AH03139: ap_queue_pop failed
… [mpm_worker:crit] [pid 25242:tid 140005187725056] (22)Invalid argument:
AH03139: ap_queue_pop failed
… [mpm_worker:crit] [pid 25242:tid 140005187725056] (22)Invalid argument:
AH03139: ap_queue_pop failed
… [mpm_worker:alert] [pid 25396:tid 140005187929856] (11)Resource temporarily
unavailable: AH03142: apr_thread_create: unable to create worker thread
Or
… [core:error] [pid 401:tid 140005310155712] AH00546: no record of generation 0
of exiting child 25241
… [core:error] [pid 401:tid 140005310155712] AH00546: no record of generation 0
of exiting child 25396
And no other messages.
Sometimes apache2 hangs with 100% CPU usage even after the ab testing was
finished:
5875 www-data 20 0 372M 3340 1068 S 198. 0.1 0:31.16 /usr/sbin/apache2 -k start
The load was going up to 20 even without any further requests. I have to
restart apache2 to fix this.
`strace` during the ab run shows this:
2835 clone(child_stack=0x7f6d09becf70,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,parent_tidptr=0x7f6d09bed9d0,
tls=0x7f6d09bed700, child_tidptr=0x7f6d09bed9d0) = -1 EAGAIN (Resource
temporarily unavailable)
2835 clone(child_stack=0x7f6d09becf70,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,parent_tidptr=0x7f6d09bed9d0,
tls=0x7f6d09bed700, child_tidptr=0x7f6d09bed9d0) = -1 EAGAIN (Resource
temporarily unavailable)
2835 clone(child_stack=0x7f6d09becf70,
flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID,parent_tidptr=0x7f6d09bed9d0,
tls=0x7f6d09bed700, child_tidptr=0x7f6d09bed9d0) = -1 EAGAIN (Resource
temporarily unavailable)
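The EAGAIN results above are what thread creation reports when a task limit is hit. As an illustration only (this is Python, not httpd's C code, and `start_workers` is a hypothetical helper), a process can treat that failure as "run with fewer threads" instead of aborting:

```python
import threading


def start_workers(wanted, target, spawn=None):
    """Start up to `wanted` threads; stop early if no more can be created."""
    if spawn is None:
        def spawn():
            t = threading.Thread(target=target, daemon=True)
            # CPython raises RuntimeError when the OS refuses a new thread.
            t.start()
            return t
    started = []
    for _ in range(wanted):
        try:
            started.append(spawn())
        except RuntimeError as exc:
            # Keep the threads we already have and report the shortfall.
            print(f"only {len(started)} of {wanted} threads started: {exc}")
            break
    return started


workers = start_workers(4, lambda: None)
print(f"running with {len(workers)} worker threads")
```

The `spawn` parameter exists only so the failure path can be exercised without actually exhausting a task limit.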
So something was preventing apache2 from creating new threads. To make a
long story short: it was systemd, which limited the total number of tasks
(processes and threads) in the apache2 cgroup to 225:
$ systemctl status apache2 | grep Tasks
Tasks: 194 (limit: 225)
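A quick sketch of what these numbers mean next to the MPM configuration: it parses the `Tasks:` line shown above and compares the limit with `MaxRequestWorkers`. The function name `parse_tasks` is invented for this example. Counting only worker threads gives a lower bound, since every child process needs additional tasks beyond its workers.

```python
import re


def parse_tasks(line):
    """Parse 'Tasks: <current> (limit: <max>)' into (current, limit)."""
    m = re.search(r"Tasks:\s*(\d+)\s*\(limit:\s*(\d+)\)", line)
    if m is None:
        raise ValueError(f"unrecognised Tasks line: {line!r}")
    return int(m.group(1)), int(m.group(2))


current, limit = parse_tasks("Tasks: 194 (limit: 225)")
max_request_workers = 400  # MaxRequestWorkers from the configuration above

print(f"tasks still available: {limit - current}")
print(f"worker threads that cannot be created: at least {max_request_workers - limit}")
```

For the values in this report that is 31 tasks of headroom at the time of the check, and at least 175 configured worker threads that can never be started.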
After setting the systemd value `DefaultTasksMax=infinity`, everything was
fine, but two questions remain:
1. The error messages show that apache2 does not handle a failure of the
clone() call gracefully. I think this is a bug.
2. Can apache2 detect a task limit imposed via cgroups on its own, so that it
can catch this kind of misconfiguration? In this case the maximum number of
configured threads was larger than the limit of the cgroup. Is it possible to
detect this and write a warning to the log files? In other cases of limits like
these, apache2 is more helpful with its error messages.
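To illustrate what question 2 could look like in practice, here is a rough sketch that locates and reads the `pids.max` file of the cgroup a process runs in. It is not httpd code. The assumptions: the cgroup filesystem is mounted at `/sys/fs/cgroup`, the v1 pids controller sits under `/sys/fs/cgroup/pids`, and all function names are invented for this example. Hybrid setups or containers may use different mount points.

```python
from pathlib import Path


def pids_max_path(proc_cgroup_text, root="/sys/fs/cgroup"):
    """Return the likely pids.max path for a /proc/<pid>/cgroup listing."""
    v2_path = None
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)  # hierarchy-ID:controllers:path
        if len(parts) != 3:
            continue
        _hier, controllers, cgroup_path = parts
        rel = cgroup_path.lstrip("/")
        if "pids" in controllers.split(","):
            # cgroup v1: dedicated pids hierarchy (assumed mount point).
            return Path(root) / "pids" / rel / "pids.max"
        if controllers == "":
            # cgroup v2: unified hierarchy has an empty controller list.
            v2_path = Path(root) / rel / "pids.max"
    return v2_path


def read_task_limit(path):
    """Return the limit as an int, or None for 'max' or an unreadable file."""
    try:
        value = Path(path).read_text().strip()
    except OSError:
        return None
    return None if value == "max" else int(value)


def limit_warning(max_request_workers, limit):
    """Return a warning string if the cgroup limit is below the config."""
    if limit is not None and limit < max_request_workers:
        return (f"cgroup task limit {limit} is lower than "
                f"MaxRequestWorkers {max_request_workers}")
    return None


print(limit_warning(400, 225))
```

On a real system the input for `pids_max_path` would be the contents of `/proc/self/cgroup`; the last line simply feeds in the two numbers from this report.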
[1]http://mail-archives.apache.org/mod_mbox/httpd-users/201810.mbox/%3C544ce613-3f9b-5f0f-c52e-2946af4396e9%40maxcluster.de%3E
--
You are receiving this mail because:
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: bugs-***@httpd.apache.org
For additional commands, e-mail: bugs-***@httpd.apache.org