We’ve resolved the PM2 issues around cluster-mode for Node v16. Here are the release notes.
As we’ve been continuing our efforts to harden Cleavr, some reports started to come in about Strapi and Nuxt apps not working properly when utilizing cluster-mode on Node v16. We’ve tackled that issue and added some new features along the way!
Strapi
- Split Strapi out to Strapi 3 and Strapi 4 app-types to provide improved support by version
- Resolved cluster-mode issue on Node v16
- Resolved persistent file storage issue during deployments
For existing Strapi apps
Unfortunately, we were not able to make the cluster-mode improvements backwards compatible for existing apps. But…! You can make some quick updates to take advantage of the new improvements.
Strapi v3 apps
Add the following file and its contents to the root of your project and push to your code repo:
.cleavr.runner.js
Add the following contents to the file:
const strapi = require("strapi");
strapi().start();
After pushing the above file to your code repo, see the PM2 Ecosystem Updates section below to complete setup.
Strapi v4 apps
Add the following file and its contents to the root of your project and push to your code repo:
.cleavr.runner.js
Add the following contents to the file:
const strapi = require("@strapi/strapi");
strapi().start();
After pushing the above file to your code repo, see the PM2 Ecosystem Updates section below to complete setup.
Nuxt
- Split Nuxt SSR out to Nuxt SSR 2 and Nuxt SSR 3 app-types
- Nitro server-engine support for Nuxt SSR 3
- Resolved cluster-mode issue on Node v16
For existing Nuxt apps
See the PM2 Ecosystem Updates section below if you’d like to take advantage of the new updates for cluster-mode with Node v16.
Directus
- 1-click install!
- Resolved cluster-mode issue on Node v16
- Resolved persistent file storage issue during deployments
Now when you add a new Directus site, you can enter admin login credentials and install the Directus bootstrap with just 1-click! Need to deploy your Directus site from your code repository? No problem! A web app is still created when adding a new Directus site, so you can deploy from your code repo just as easily.
For existing Directus apps
See the PM2 Ecosystem Updates section below if you’d like to take advantage of the new updates for cluster-mode with Node v16.
PM2 Ecosystem Updates
Navigate to your web app > settings > build tab, and update script to ".cleavr.runner.js" and args to "". The exception is Nuxt SSR 2 apps, where args needs to be set to start.
Also, make sure instances is set to max and exec_mode is set to cluster_mode.
After making the changes, deploy your project for the changes to take effect.
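To make the settings above easier to compare against your own config, here is a minimal sketch of the values they describe. The `runnerSettings` helper and the app-type labels such as "nuxt-ssr-2" are made up for illustration; they are not names Cleavr uses, and your PM2 ecosystem will contain other fields (name, cwd, env) that are left out here.

```javascript
// Hypothetical helper: returns the PM2 settings described above for a given app type.
// The app-type labels ("nuxt-ssr-2", "strapi-4", ...) are assumptions made for this sketch.
function runnerSettings(appType) {
  return {
    script: ".cleavr.runner.js",
    // Nuxt SSR 2 needs "start" as its argument; every other app type uses an empty string.
    args: appType === "nuxt-ssr-2" ? "start" : "",
    instances: "max",
    exec_mode: "cluster_mode",
  };
}

// Print the settings for a Strapi 4 app so they can be compared with the build tab.
console.log(runnerSettings("strapi-4"));
```

Only these four fields should need to change in the ecosystem; the rest of the existing config can stay as it is.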
Note for monorepos
If you have a monorepo setup, we recommend you do not make these changes quite yet. We’ll be working on a solution to better handle monorepos.
Oh, no!
I’m running Quasar and I now also get
had too many unstable restarts (16). Stopped. "errored"
Please advise, my customer is waiting
Hello @peterc,
I think there is another topic that is related to the issue that you’re facing. Please refer to this thread
Do let us know whether that works for you or not.
I was going to call my customer in 5 minutes, you saved me from a lot of trouble @anish
Thank you so much!
Hello, I’m running NodeV16 with StrapiV4 at Vultr 2 CPU and added the .cleavr.runner.js file at the root. Then I deployed the app but the server still has a 100% CPU utilization.
I’m running a trial to test Cleavr. Any idea how to fix this?
Thanks
Hello @BureauBerg,
First of all, welcome to the Cleavr forum.
We’ll look into the issue, but in the meantime, can you please check whether the site is throwing any errors, such as a 502 error? You can also check the PM2 log from the deployments page by clicking the Load PM2 Logs button or by going to Server > Logs. We’ve also noticed that CPU utilization reaches its maximum when there are certain errors at the app level.
You can also view NodeJS logs from the services section and resolve issues if there are any.
You can follow these links to troubleshoot 502 errors for NodeJS based applications:
Hi Anish, thanks for your reply.
It indeed throws a 502 error so I checked the PM2 log which shows:
PM2 log: App name: xyz disconnected
PM2 log: App [xyz] exited with code [1] via signal [SIGINT]
PM2 log: App [xyz] starting in -cluster mode-
PM2 log: App [xyz] online
and the Nginx log is as follows:
The “connect() failed (111: Connection refused) while connecting to upstream, client”
I’m a newbie but it looks to me that Nginx doesn’t have access permission to the app. And some folder settings that might be not correct yet?
I created a new system user and added the app to that user when setting it up in Cleavr and I thought these permission settings would be added automatically.
Could anyone shed light on this?
Thanks a lot!
Jacco
Hello @BureauBerg,
There are two possible causes for your issue:
It looks like you’re building your app on GitHub. If your project requires a database connection during the build, you need to provide database credentials in the PM2 Ecosystem. To do so, go to Webapp > Settings > Build > PM2 Ecosystem and add your credentials in the env section. It may look something like this:
env: {
"PORT": 3333,
"CI": 1,
"NUXT_TELEMETRY_DISABLED": 1,
"DATABASE_HOST": "localhost",
"DATABASE_PORT": "3306",
"DATABASE_NAME": "database_name",
"DATABASE_USERNAME": "database_username",
"DATABASE_PASSWORD": "database_password"
}
The other possibility is that you’ve not updated your environment variables yet. Strapi requires some secrets to run, such as APP_KEY. If you’ve not updated your environment variables from Webapp > Environment, make sure to check the .env file in your local project and update them accordingly.
Make sure to re-deploy after performing the steps above.
I hope it helps. Let us know if that doesn’t resolve your issue.
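If it helps, here is a small sketch for checking which variables are still unset before you re-deploy. The `missingEnvVars` helper is hypothetical, and the list of names is only an example taken from the env block above; use the names from your own local .env file.

```javascript
// Returns the names from `required` that are undefined or blank in `env`.
function missingEnvVars(required, env = process.env) {
  return required.filter(
    (name) => env[name] === undefined || String(env[name]).trim() === ""
  );
}

// Example list only: replace it with the variable names your project expects.
const required = [
  "DATABASE_HOST",
  "DATABASE_PORT",
  "DATABASE_NAME",
  "DATABASE_USERNAME",
  "DATABASE_PASSWORD",
];

const missing = missingEnvVars(required);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```

Running this with Node on the server, in the same environment as the app, should show quickly whether anything from the list was never set.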
I’m running into this issue with a remix app—is there a tutorial somewhere to help me set this up properly with remix instead of nuxt?
Thanks. I followed that tutorial to set up my site initially. It’s running in cluster mode, and running with high CPU because the site keeps stopping and restarting with the Error: listen EADDRINUSE: address already in use error.
For this error, you can try to restart the app in the deployment workflow > app > app status section. If restarting from there doesn’t work, you can try doing a harder restart from server > services and select the repair option under Node JS service.
Or, it could just be that there is another app running on the same port number… If that’s the case, check the site nginx configs and see if more than one site is on the same port.
I don’t have any other apps running on the same port. I tried restarting the app, and run into the same problem. Looks like the problem is that it’s restarting over and over (see screenshot). I’m digging into dotenv a little to see if that’s related.
Gotcha! There are some additional details typically in the Server > NodeJS heartbeat which may pinpoint what PM2 is seeing as the failure.
Thanks! Taking a look.
Looks like it’s got this error a couple of times (I swapped out my site name):
28|open.in | Error: ENOENT: no such file or directory, chdir '/' -> '/home/cleavr/<SITE_NAME>/artifact'
28|open.in | at wrappedChdir (node:internal/bootstrap/switches/does_own_process_state:112:14)
28|open.in | at process.chdir (node:internal/worker:99:5)
28|open.in | at /usr/lib/node_modules/pm2/lib/ProcessContainer.js:298:13
28|open.in | at wrapper (/usr/lib/node_modules/pm2/node_modules/async/internal/once.js:12:16)
28|open.in | at next (/usr/lib/node_modules/pm2/node_modules/async/waterfall.js:96:20)
28|open.in | at /usr/lib/node_modules/pm2/node_modules/async/internal/onlyOnce.js:12:16
28|open.in | at WriteStream.<anonymous> (/usr/lib/node_modules/pm2/lib/Utility.js:186:13)
28|open.in | at WriteStream.emit (node:events:513:28)
28|open.in | at node:internal/fs/streams:75:16
28|open.in | at FSReqCallback.oncomplete (node:fs:198:23)
This error from PM2 is related to the restart loop:
PM2 | Error: listen EADDRINUSE: address already in use :::9718
PM2 | at Server.setupListenHandle [as _listen2] (node:net:1740:16)
PM2 | at listenInCluster (node:net:1788:12)
PM2 | at Server.listen (node:net:1876:7)
PM2 | at Function.listen (/home/cleavr/<SITE_NAME>/releases/20230411071909064/client/node_modules/express/lib/application.js:635:24)
PM2 | at Object.<anonymous> (/home/cleavr/<SITE_NAME>/releases/20230411071909064/client/node_modules/@remix-run/serve/dist/cli.js:44:84)
PM2 | at Module._compile (node:internal/modules/cjs/loader:1254:14)
PM2 | at Object.Module._extensions..js (node:internal/modules/cjs/loader:1308:10)
PM2 | at Module.load (node:internal/modules/cjs/loader:1117:32)
PM2 | at Function.Module._load (node:internal/modules/cjs/loader:958:12)
PM2 | at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
PM2 | 2023-04-13T18:21:19: PM2 log: App name:<SITE_NAME> id:29 disconnected
PM2 | 2023-04-13T18:21:19: PM2 log: App [<SITE_NAME>:29] exited with code [1] via signal [SIGINT]
PM2 | 2023-04-13T18:21:19: PM2 log: App [<SITE_NAME>:29] starting in -cluster mode-
PM2 | 2023-04-13T18:21:19: PM2 log: App [<SITE_NAME>:29] online
I dug into this for a while today and haven’t been able to find the root cause yet. From the PM2 monitor, it looks like the issue is that in cluster mode, it’s trying to run multiples of the same app and one of them is in a restart loop because the address is already in use. This is only happening on the remix apps I have running.
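For anyone hitting the same restart loop, a quick way to see which port is being fought over is to pull it out of the PM2 log. This is only a sketch: `portsInUse` is a made-up helper, and it assumes the log lines look like the EADDRINUSE line quoted above.

```javascript
// Collects the distinct port numbers reported in EADDRINUSE errors within a log.
function portsInUse(logText) {
  // The address before the port may be IPv4, IPv6, or empty (":::9718"),
  // so take the digits after the last colon of the address token.
  const pattern = /EADDRINUSE: address already in use \S*?:(\d+)(?=\s|$)/g;
  const ports = new Set();
  for (const match of logText.matchAll(pattern)) {
    ports.add(Number(match[1]));
  }
  return [...ports];
}

const sample = "PM2 | Error: listen EADDRINUSE: address already in use :::9718";
console.log(portsInUse(sample));
```

If the same port shows up for every restart, it points to several instances of one app competing for it rather than a clash with a different site.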
Hmmm… I don’t see anything that says Remix doesn’t support cluster_mode on their docs for any particular reason.
You could try SSH’ing in, CD to the project path, and run pm2 status to see the processes running. Then remove the affected processes using pm2 delete <process number> (note that pm2 kill stops the PM2 daemon along with every process it manages), run pm2 start .cleavr.config.js, and see if that clears up the issue.
An alternative would be to set instances to 1 in the deployment workflow > settings > build > PM2 ecosystem. Update to:
...
instances : "1",
...
You’d then need to redeploy the app for the change to take effect.
Did some digging on the Remix discord and it led to the answer. Short version is that pm2 and npm sometimes don’t work well together, so I had to create an express server on my root and run that way. Deployment build settings look like this now:
module.exports = {
name: "...",
script: "./server.js", // this is a basic express server. check remix docs.
log_type: "json",
cwd: "/home/cleavr/staging.open.ink/current", // had to update this to current to find the server file
instances : "max",
exec_mode : "cluster_mode",
env: {
// currently have to include all env vars here, because it's not loading dotenv correctly
// likely will switch to using something like 1password for env vars in the future, so not a big deal
}
}
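Since every env var currently has to be listed in the env block by hand, one possible stopgap is to build that object from the contents of a .env file. The sketch below is a deliberately small parser, not a replacement for dotenv: `parseEnvText` is a hypothetical helper, and it ignores multi-line values, escapes, and inline comments.

```javascript
// Minimal KEY=VALUE parser for .env-style text (illustration only).
function parseEnvText(text) {
  const env = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    // Skip lines without a key before the "=".
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value[0];
    if (value.length >= 2 && (first === '"' || first === "'") && value.endsWith(first)) {
      value = value.slice(1, -1);
    }
    env[key] = value;
  }
  return env;
}

console.log(parseEnvText('# example\nPORT=3333\nSITE_NAME="my app"'));
```

Whether the PM2 ecosystem field in the build settings accepts arbitrary JavaScript like this is an assumption on my part; if it does not, the parsed values can still be pasted into the env block manually.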