E-Mail help



erich
04-30-2004, 10:38 AM
For the last few weeks we have been having trouble getting e-mail delivered. Forum members have been very helpful with instructions on using SSH to show the mailq and run sendmail -q -v.

Today, that doesn't work. I get the message "Skipping queue run -- load average too high." There are 145 undelivered e-mails in our queue and I can't get them to send/deliver!

I have filed a ticket with WestHost, but I'm not sure I will receive their reply since my queue isn't being cleared out.

I'd like to know the following...

1. What is causing the problem?

2. What can be done to correct it for now?

3. What will be done in the future so we don't experience this again?

We should not have to log in with PuTTY to process our mail queue.

I need my e-mail delivered! It's not like this is a new problem today...it has been ongoing.

Thanks,

Erich

jalal
04-30-2004, 11:26 AM
One of my customers had the same problem this morning; SSH on the server was horribly slow as well.
I filed a ticket with WestHost and 2 hours later it seems to be working correctly. I have no idea what the problem was, though.

HTH

Wiggins
04-30-2004, 12:31 PM
Yep, I am experiencing it too. Sendmail has a configuration setting that keeps it from taking up system resources when the system is overloaded, which is why the -q run gives that message.
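
For anyone wondering what that kind of setting does: sendmail has QueueLA and RefuseLA options that compare the system load average against a threshold. Below is a minimal Python sketch of such a check. It is a simplification and not sendmail's actual logic; the function name and the threshold of 8.0 are assumptions made for illustration only.

```python
import os


def should_skip_queue_run(load_1min: float, queue_la: float = 8.0) -> bool:
    """Return True when the 1-minute load average is at or above the threshold.

    queue_la is a hypothetical threshold; the value configured on any
    given server may be different.
    """
    return load_1min >= queue_la


if __name__ == "__main__":
    try:
        # os.getloadavg() returns the 1-, 5- and 15-minute load averages.
        load_1min = os.getloadavg()[0]
    except (OSError, AttributeError):
        # Not every platform or container exposes the load average.
        load_1min = None

    if load_1min is None:
        print("load average unavailable")
    elif should_skip_queue_run(load_1min):
        print("load too high: a queue run would be skipped")
    else:
        print("load acceptable: a queue run would proceed")
```

If the server's threshold is in this range, a load in the double digits would explain why a manual queue run refuses to do anything until the load drops.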

From what I can tell there is no way to get the load average from within the VPS. It is the load on the overall system (i.e. all VPSes combined) that is causing the issue. In general it is affecting everything on the system: I have seen noticeable delays in SSH, MySQL connectivity, HTTP response, and mail handling. This morning NeoMail ate my mail spool.

Generally this is caused either by a DoS (denial of service) attack, internal or external, or by someone on the system eating all of the resources. I'm not sure whether the VPS allows process throttling, but you would think WH could check the overall system health and determine the problem.

Either that, or it is time for WH to upgrade their hardware because they have too many clients on one system, which has always been my fear with an account like this.

It would be nice to hear something realistic, beyond "we know there is a problem and it will be fixed." Disappointed,

WdA

Wiggins
04-30-2004, 12:59 PM
From what I can tell there is no way to get the load average from within the VPS.

OK, so there is a way to get the load average; I just won't disclose it here, to save WH the headache of unknowing users asking questions they shouldn't. The load on the box on my account just before I posted my previous message was apparently outrageous, at least for the suspected type of hardware.

It would still be nice to know what a typical load average is on one of these servers, and whether WH monitors loads, which I presume they do.

Interestingly just before I started checking the average was 36+, since checking (aka around the time I sent my first message) the load average never went above 17, and was more frequently around 8-12....

12:38pm 10.80 18.46 36.65
12:39pm 10.49 18.27 36.49
12:39pm 9.71 17.85 36.16
12:39pm 9.71 17.85 36.16
12:39pm 8.93 17.56 35.97
12:39pm 8.93 17.56 35.97
12:39pm 8.03 17.08 35.61
12:39pm 10.33 17.00 35.19
12:39pm 8.30 14.79 32.47
12:41pm 7.71 14.56 32.30
12:41pm 7.71 14.56 32.30
12:41pm 11.19 14.87 32.03
12:42pm 9.24 14.26 31.55
12:42pm 12.27 14.72 31.43
12:42pm 13.82 14.65 30.60
12:43pm 13.04 14.38 30.17
12:44pm 8.17 12.91 28.92
12:44pm 7.25 12.29 28.29
12:44pm 16.27 12.89 26.65
12:46pm 15.77 12.84 26.56
12:46pm 17.93 13.96 26.08
12:47pm 9.10 12.18 24.78
12:47pm 4.98 9.87 22.54
12:50pm 8.34 9.96 22.03
12:51pm 17.30 12.07 21.81
12:57pm 6.28 10.12 18.47
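
For anyone who wants to summarize samples like these, here is a rough Python sketch. It assumes the three numeric columns are the 1-, 5- and 15-minute load averages, as in typical `uptime` output; the post does not label them. The `LoadSample` type and `parse_sample` helper are hypothetical names, and only a handful of the rows above are included.

```python
from typing import NamedTuple


class LoadSample(NamedTuple):
    time: str
    one: float      # assumed: 1-minute load average
    five: float     # assumed: 5-minute load average
    fifteen: float  # assumed: 15-minute load average


def parse_sample(line: str) -> LoadSample:
    """Parse a line such as '12:38pm 10.80 18.46 36.65'."""
    time, one, five, fifteen = line.split()
    return LoadSample(time, float(one), float(five), float(fifteen))


# A subset of the samples listed above.
SAMPLES = """\
12:38pm 10.80 18.46 36.65
12:44pm 16.27 12.89 26.65
12:46pm 17.93 13.96 26.08
12:47pm 4.98 9.87 22.54
12:57pm 6.28 10.12 18.47
"""

samples = [parse_sample(line) for line in SAMPLES.splitlines() if line.strip()]

# Highest short-term load seen in this subset.
peak = max(samples, key=lambda s: s.one)
print(peak.time, peak.one)  # 12:46pm 17.93

# The long-term average decays steadily over the sampling window.
print(samples[0].fifteen, "->", samples[-1].fifteen)  # 36.65 -> 18.47
```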

WdA

erich
04-30-2004, 01:06 PM
The load on the box on my account just before I posted my previous message was apparently outrageous, at least for the suspected type of hardware.

WdA


I was finally able to clear out our queue after about five hours. I don't know if WH did something or if the load just decreased.

But my three questions above still remain, as does the fact that we should not need to log in to process the queue.

Erich

j103c
04-30-2004, 02:36 PM
Do you guys know what server you are on?

I have an account on server 6 where I suspect this same thing is happening. I posted in another thread and had more or less ruled out WH server-side load/config, but this makes me suspicious again. Lately the mail queue on that account normally has 15-60 e-mails sitting in it.

Wow - I just looked again:


/var/spool/mqueue (286 requests, only 20 printed)

:shock:
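
If you want to track that number over time instead of eyeballing it, the count can be pulled out of the summary line with a small script. This is only a sketch: the regular expression is based on the single line quoted above, other sendmail versions may word the summary differently, and `queued_requests` is a hypothetical helper name.

```python
import re
from typing import Optional

# Matches e.g. "(286 requests, only 20 printed)" or "(1 request)".
SUMMARY_RE = re.compile(r"\((\d+) requests?(?:, only (\d+) printed)?\)")


def queued_requests(mailq_output: str) -> Optional[int]:
    """Return the number of queued requests, or None if no summary is found."""
    match = SUMMARY_RE.search(mailq_output)
    return int(match.group(1)) if match else None


line = "/var/spool/mqueue (286 requests, only 20 printed)"
print(queued_requests(line))  # 286
```

Feeding it the saved output of mailq every few minutes would show whether the queue is growing or draining.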

I have an account on another server, but the mail is handled by a third-party server and WH just relays e-mails on to it. There don't seem to be any delays with those.

erich
04-30-2004, 02:44 PM
Do you guys know what server you are on?

Server 3.

Erich

jalal
04-30-2004, 02:57 PM
Interestingly just before I started checking the average was 36+, since checking (aka around the time I sent my first message) the load average never went above 17, and was more frequently around 8-12....

I've run some checks on the load averages on my server and they average somewhere between 0.25 and 0.35 (with peaks at 0.45).

Are you multiplying your figures by 10???

:?:

Wiggins
04-30-2004, 03:41 PM
Nope, taking it directly from what the system is providing. Are you on shared or dedicated?

WdA

FZ
04-30-2004, 04:04 PM
I'm on (shared) server 20. I don't see any major problems, but I have noticed a few e-mails taking a long time to get through (and one or two in my mail queue) over the past 2 days as well. Most recent load averages on my server: 9.98 6.07 4.83 (these are slightly higher than what I had when I checked about 2 hours ago). [I'm not multiplying it by 10 either, jalal ;)]
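
A quick way to read a triple like that, assuming the usual order of 1-, 5- and 15-minute averages (nobody in the thread labels the columns): if the short-term figure is above the long-term one, load is building, and if it is below, load is easing off. The `load_trend` function below is a hypothetical helper sketching that comparison.

```python
def load_trend(one: float, five: float, fifteen: float) -> str:
    """Classify a load-average triple as 'rising', 'falling' or 'mixed'."""
    if one > five > fifteen:
        return "rising"
    if one < five < fifteen:
        return "falling"
    return "mixed"


# The server 20 figures quoted above: load is building.
print(load_trend(9.98, 6.07, 4.83))     # rising

# The first sample posted earlier in the thread: load is easing off.
print(load_trend(10.80, 18.46, 36.65))  # falling
```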

WestHost - MMellor
04-30-2004, 04:07 PM
Hello Everyone,

As most of you are probably aware, we were having some server issues yesterday. We know what the problem was and we have fixed it so it will never happen again. We appreciate all of your patience with this and we apologize for any inconvenience that this has caused.