  1. #1
    Junior Member
    Join Date
    Sep 2011
    Posts
    1

    Three days to replace server with failed hard disk??!!

    I'm on server #wsl01022, which has been down since last Friday night: no email, and all my sites are dead.

    The Fix ETA currently listed is 5:00 pm *Monday*..??!!

    I find it incredible, and beyond frustrating, that it would take so long for you to recover from a server hard disk failure. One would think you guys would plan for failures like this: hard disk failures are routine, and recovery should be equally routine. Simply pull the old server, install a new server, and restore the data from backup in a matter of hours, not days, and certainly not *three* days. That's just ridiculous, and unacceptable.

    I've referred many of my clients your way over the years, and have been a loyal client myself since 2002, but that may have to change in light of recent downtime problems. During last year's catastrophe, my sites were down for days and days, and now this. I can't afford to depend on a host that isn't dependable.

    PLEASE get this resolved *immediately.*

    -Richard M.

  2. #2
    Junior Member
    Join Date
    Feb 2010
    Posts
    5

    I am also on this server and have been using Westhost for many years. I find it equally shocking that it takes 3 days to restore the server from backup. We were down for many days last year, and now another 3 consecutive days of downtime. Three days because of a hard disk failure. I find that amazing.

  3. #3
    Junior Member
    Join Date
    Feb 2010
    Posts
    5

    The server is up and running again. It ended up being down for 2 days, not 3.

  4. #4

    Richard and Prints,

    First, sorry for the problems this caused both of you. We understand the importance of providing a reliable hosting solution. As you mentioned, hard disk failures happen, and we are prepared for them. We had replacement hardware ready and swapped out the bad hardware immediately. We also had a backup that we could restore to the new hardware. Unfortunately, because of the amount of data that needed to be restored, the restoration took a long time to complete. We recognize the importance of keeping our clients online and will continue to do everything we can to make sure that happens.

    Thanks

    Adam

  5. #5
    Senior Member rolling's Avatar
    Join Date
    May 2004
    Location
    Different day, different place
    Posts
    486

    I'm a bit surprised that a hard drive failure should take down a server. Do your servers not have RAID arrays?
    Quote: Originally posted by Westhost
      Data Center Specs
      CentOS 5 w/Apache 2: Yes
      Gigabit Ethernet: Yes
      RAID 10 Redundancy: Yes
      SAS 70 Type II Compliant: Yes
      Tier 3 classification: Yes
      Onsite WestHost Staff for Monitoring & Service: Yes
      Cold Aisle Containment & Cooling: Yes
      Redundant Power Sources: Yes
      Multiple Network Carriers: Yes
      VESDA Smoke Detection System: Yes
      Disaster-Safe Location: Yes
      24x7 Security and Digital Video Surveillance: Yes
      30" Raised Flooring: Yes
    If the controller fails, then it should be a matter of swapping the drives from one RAID to another. Or did the controller fail and corrupt all 10 disks?
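
    For anyone following along, the general idea behind RAID 10 is that data is striped across mirrored pairs of disks, so the array keeps running as long as no pair loses both of its disks. Below is a minimal Python sketch of that rule. The function name and the pairing of adjacent disk numbers as (0,1), (2,3), ... are assumptions made for illustration; real layouts depend on the controller, and nothing here describes WestHost's actual setup.

```python
def raid10_survives(failed_disks, num_disks):
    """Return True if a RAID 10 array still has one working disk per mirror pair.

    Assumes (hypothetically) that disks are numbered from 0 and that
    adjacent numbers form the mirrored pairs: (0, 1), (2, 3), ...
    """
    if num_disks < 4 or num_disks % 2 != 0:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    failed = set(failed_disks)
    for first in range(0, num_disks, 2):
        # A pair is lost only when both of its members have failed.
        if first in failed and first + 1 in failed:
            return False
    return True


if __name__ == "__main__":
    print(raid10_survives({0}, 4))     # one disk down
    print(raid10_survives({0, 2}, 4))  # one disk down in each pair
    print(raid10_survives({0, 1}, 4))  # both disks of one pair down
```

    Under these assumptions a single failed drive should not take the array offline, which is the point of the question above.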
    Richard

    I have jotted down some of my meddlings at http://www.rollingr.net/wordpress

  6. #6
    Moderator ifurniss's Avatar
    Join Date
    Oct 2009
    Location
    Utah
    Posts
    47

    The specs you are quoting are likely not related to your server, since you are hosted in our Site Manager environment if you are on wsl01022. We do have RAID hard drives running in all of our servers, but when a drive fails we do not simply pull a hard drive from the backup server and swap it in for the failed drive on yours.

    Data on servers like wsl01022 is spread across multiple drives on the machine. Backups are stored across multiple drives on another machine. When the server crashes in such a way that we cannot bring things back online more quickly (in this instance, a failed drive), we install a new, blank hard drive and then copy the data over from the backup server. The backup server's drive remains where it is, and all data then has to be synced over to the new drive on the server.
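
    As a rough illustration of why syncing everything from a backup server can take a long time, here is a back-of-the-envelope Python sketch. The data size and transfer rates are made-up assumptions, not actual figures for wsl01022, and the function name is hypothetical. Real restores of many small files usually run well below the raw link or disk speed.

```python
def estimate_restore_hours(data_gb, throughput_mb_s):
    """Hours needed to copy data_gb gigabytes at a sustained throughput_mb_s."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    total_mb = data_gb * 1000  # decimal units: 1 GB = 1000 MB
    seconds = total_mb / throughput_mb_s
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show how the estimate scales.
    assumed_data_gb = 2000
    for assumed_rate in (10, 50, 100):
        hours = estimate_restore_hours(assumed_data_gb, assumed_rate)
        print(f"{assumed_data_gb} GB at {assumed_rate} MB/s: about {hours:.1f} hours")
```

    The estimate covers only the copy itself; any file system check run afterwards adds to the total.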

    This does not happen often. In most instances our admin team sees the issue before an actual crash and has time to schedule maintenance with a shorter replacement window. That was not the case this time. We do apologize for any downtime you experienced, but we worked as fast as we could to get things back up and running. Additionally, we probably ran a file system check once the restore was done, before rebooting, to make sure everything was intact. That adds to the ETA, since a file system check can run for quite some time on a large drive.
    Isaac Furniss
    Technical Support Team Manager

    http://www.westhost.com/
