Re: [tsc-devel] On the server issues
Chris Jacobsen
Tue, 20 Jan 2015 00:09:14 UTC
Making a program to monitor the server and restart it if it goes down makes sense. This is probably the first thing we should do now.
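A rough sketch of what such a monitor could look like, just to make the idea concrete. It polls a URL and runs a restart command after several failed checks in a row. The URL, the restart command, the thresholds and all function names are assumptions of mine for illustration, not what is actually configured on the server.

```python
import subprocess
import time
import urllib.error
import urllib.request

# All of these values are assumptions for illustration only.
CHECK_URL = "http://localhost/"                   # hypothetical page to poll
RESTART_CMD = ["service", "apache2", "restart"]   # assumed Debian service name
TIMEOUT_S = 10                                    # seconds to wait for a reply
MAX_FAILURES = 3                                  # failed checks before a restart


def is_up(url, timeout=TIMEOUT_S, opener=urllib.request.urlopen):
    """Return True if the URL answers with a non-5xx status in time."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 404 or 503).
        return err.code < 500
    except OSError:
        # Connection refused, DNS failure or timeout: treat as down.
        return False


def restart_httpd():
    """Run the (assumed) restart command and return its exit code."""
    return subprocess.call(RESTART_CMD)


def watch(check, restart, failures_allowed=MAX_FAILURES, interval=30,
          sleep=time.sleep, rounds=None):
    """Call check() every `interval` seconds; after `failures_allowed`
    failures in a row, call restart() and reset the counter.

    Loops forever unless `rounds` is given. Returns the number of restarts.
    """
    failures = 0
    restarts = 0
    done = 0
    while rounds is None or done < rounds:
        done += 1
        if check():
            failures = 0
        else:
            failures += 1
            if failures >= failures_allowed:
                restart()
                restarts += 1
                failures = 0
        sleep(interval)
    return restarts
```

Started as root with watch(lambda: is_up(CHECK_URL), restart_httpd), this would loop until killed. Requiring several failures in a row is meant to avoid restarting on a single slow response. It only treats the symptom, of course.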
Long term we probably do need to resolve the root cause of the issues: if the CPU goes to 100%, it might cause some intermittent issues for users even if a server restart is initiated. Whether a new version of Debian is better or whether we should do more work to just get the wiki software up to date is something I'll probably have to leave to you guys.
-datahead
On Monday, January 19, 2015 6:07 PM, Sydney Dykstra <…d@o…> wrote:
I’m not sure what the best option is. I would almost say go with Debian
Jessie, because it is pretty close to being released as stable, and Jessie
has a long life ahead of it. Also, in my experience it is very stable. The
one thing I do notice is that I cannot seem to get crontab to work, but
that may just be me.
Also, if we do that, won’t it be hard to transfer everything over? Just
thoughts.
-Sydney
On 01/19/2015 05:50 PM, Quintus wrote:
> Hi everyone,
>
> as you might have guessed, I don’t have much time right now. I’ve
> skimmed through the IRC logs of the last days and wanted to comment on
> it a bit.
>
> The setup currently runs a standard Apache httpd on Debian wheezy. The
> forum runs on a Thin server that is reverse-proxied behind the httpd,
> and the wiki runs a Python program named Moinmoin via CGI. All the
> other pages are simple, static HTML pages that are either written
> incrementally (like the chatlogs) or on a regular interval via Cron
> (such as the mailinglist archive), or even manually such as the main
> page.
>
> The reason for the current outages is the Moinmoin wiki software. From
> time to time, it gets caught in an infinite loop with 100% CPU usage,
> blocking httpd completely and having a heavy impact on the rest of the
> server. The problem in the wiki software seems to have been resolved in
> newer versions of Moinmoin, but Debian’s policy won’t permit a more
> recent version into their repositories.
>
> As has been suggested, the problem is thus _not_ memory usage. There’s
> plenty of free memory on the server. It’s just the 100% CPU problem
> caused by the Python processes, and thus it’s not really httpd’s
> fault. If we swapped out httpd for nginx, the same problem would
> occur.
>
> This leaves us with four options: upgrade to Debian Jessie; upgrade
> only Moinmoin to a more recent version and afterwards manually take
> care to keep it up-to-date without the package manager; switch to
> different wiki software; or put proper monitoring in place that
> automatically restarts faulty daemons. I personally favour the last
> option, as monitoring is good practice for any server. The same goes
> for backups. Once I’m done with my exams, I’m going to do some proper
> server configuration in this regard. Maybe I can even put something in
> place in the meantime...
>
> A sidenote for anyone having server access: On login, the MOTD is
> printed to the terminal. It contains the names and valid email addresses
> of all people who have full admin access via sudo. In case you urgently
> need an admin email address, that’s where you want to look.
>
> Also, if you didn’t know yet, I offer SSH (and thus SFTP with download
> area!) access to anyone involved in the TSC project. Just drop me an
> email. If you don’t know where to put your large files to distribute
> them, this is for you.
>
> The most reliable way to reach me currently is by email, either directly
> (see From header of this email), or by sending to the mailinglist. I’ll
> see both, even if I don’t reply immediately.
>
> Valete,
> Quintus
>