We have been running a trial at a large organisation for nearly a year, starting with Libreboard and then moving on to Wekan. Since the move to Wekan we’ve had major CPU utilization problems with the Node process, which eventually cause it to fail.
Configuration is as follows (all servers running Red Hat 7.3):
- 3 virtual servers (2x CPU, 4GB RAM), each hosting a single Wekan instance (currently v0.41) running on Node v4.8.4
- 3 virtual servers (2x CPU, 4GB RAM), each hosting a MongoDB instance, configured in a ReplicaSet
- 1 virtual server (2x CPU, 4GB RAM) running Nginx as a load balancer
Our usage figures are as follows:
- Users: 2785
- Boards: 3564
- Lists: 10418
- Cards: 21361
The problem we’re encountering is that as the number of concurrent users increases (anything above 50), the CPU usage of the Node processes climbs to 90-100% until the process can no longer respond to requests from the web browser. At that point I either have to restart the Node process myself, or it fails and is restarted by systemd.
For reference, memory usage of the Node processes is low (<10%).
For reference, CPU usage on the primary MongoDB server is 20-30%.
I’ve tried to debug this but have so far been unable to find a root cause. My suspicion is that there’s a costly process attached to each connected client that ramps up the CPU usage, but I don’t know enough about Meteor/Node/MongoDB to debug it. Can anyone suggest any routes to investigate?
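One check that might help narrow this down is measuring event-loop lag inside the Node process: if timers fire much later than scheduled while CPU sits at 90-100%, the time is probably being spent in JavaScript on the main thread. Below is a minimal sketch of that idea, not something taken from Wekan or Meteor. The names `computeLagMs` and `startLagMonitor`, the 500 ms interval, and the 200 ms threshold are all hypothetical choices for illustration, and the TypeScript would need compiling to plain JavaScript before it could run on Node v4.8.4.

```typescript
// Hypothetical helpers for illustration only; not part of Wekan, Meteor, or Node.

/**
 * Lag is how much later than requested a timer actually fired.
 * Clamped at zero, because a timer can appear to fire slightly early
 * when measured with a millisecond clock.
 */
function computeLagMs(expectedIntervalMs: number, elapsedMs: number): number {
  return Math.max(0, elapsedMs - expectedIntervalMs);
}

/**
 * Schedules a repeating timer and reports whenever it fires more than
 * `warnAboveMs` late. Returns a function that stops the monitor.
 */
function startLagMonitor(
  intervalMs: number,
  warnAboveMs: number,
  onLag: (lagMs: number) => void
): () => void {
  let last = Date.now();
  const timer = setInterval(() => {
    const now = Date.now();
    const lag = computeLagMs(intervalMs, now - last);
    last = now;
    if (lag > warnAboveMs) {
      onLag(lag);
    }
  }, intervalMs);
  return () => clearInterval(timer);
}

// Example use (assumed values): sample every 500 ms, log anything over 200 ms.
// const stop = startLagMonitor(500, 200, (lag) => {
//   console.log("event loop lag: " + lag + " ms");
// });
// ...later: stop();
```

If the logged lag grows with the number of connected clients, that would be consistent with the per-client cost suspected above, although it would not by itself show which code is responsible.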