well I lost everything on my server

    A few weeks ago the fans on my PE 1800 started spinning at full speed. No matter how many times I reset the CMOS or rebooted, they kept doing it. One thing it reported was that the baseboard was unable to connect.

    After some research, I found I needed to update the baseboard firmware to fix it. I run Ubuntu 16.04 and tried the Linux update from Dell, a simple binary. It was built for Red Hat, but I figured it would work (I know Ubuntu is Debian-based).

    I ran the executable with sudo and a bunch of text scrolled past; I wasn't able to read all of it. I saw it trying to change permissions on /pci/xx/xx/xx devices and thought, oh, it's enumerating the onboard baseboard store, getting ready to update it.

    Turns out it was going through my RAID 1 array f'ing up the permissions on, well, everything.

    After that, it would boot but I couldn't log in (unable to access my /home). A bunch of stuff also failed to start.

    So I made a Win7 PE and tried to run the Windows updater: memory address blocks, instruction issues.

    I got pissed and hooked up a laptop hard drive and a DVD drive (only two SATA ports). I installed Windows XP on the laptop drive for the sole purpose of running this executable. It ran and updated the baseboard from 1.5 to 1.8. Right before it finished, my fans spun down to normal.

    So, Ubuntu is messed up; I'll do what I always do: take one of the RAID 1 drives, wipe it, and install with RAID 1, degraded, without the other hard drive attached. Then attach the other drive, copy all my stuff to the new install (this time 18.04), and rebuild the array.

    But the second drive was so corrupt. I tried to get it to mount and tried to get mdadm to access it for mounting; it just didn't work. I think I remember seeing fsck scanning it, and I think that might have caused the problems.

    All my tools, all my games, all my videos, and backups for customers that I luckily have not needed to access: gone. Next time I'll try to hook it up to a Windows ext access utility before fsck ever has a chance to do anything.

    I am partially to blame. I didn't realize that running an executable meant for Red Hat would recursively go through my array and fuck up all my permissions.
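One possible safety net before running a vendor updater as root is to record the permission bits of everything under a directory so they can be reapplied afterwards. This is only a minimal Python sketch, not anything Dell provides: `snapshot_modes` and `restore_modes` are made-up names, it assumes a POSIX system, and it deliberately ignores ownership (uid/gid) and ACLs.

```python
import os
import stat


def snapshot_modes(root):
    """Return {relative_path: permission_bits} for everything under root.

    Symlinks are skipped; only the mode bits are recorded (assumption:
    ownership and ACLs are out of scope for this sketch).
    """
    modes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if stat.S_ISLNK(st.st_mode):
                continue
            modes[os.path.relpath(full, root)] = stat.S_IMODE(st.st_mode)
    return modes


def restore_modes(root, modes):
    """Reapply previously recorded permission bits to paths that still exist."""
    for rel, mode in modes.items():
        full = os.path.join(root, rel)
        if os.path.lexists(full) and not os.path.islink(full):
            os.chmod(full, mode)
```

The snapshot dictionary would need to be saved somewhere other than the volume being protected (for example as JSON on another machine) to be of any use after an accident.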
    Cap Datasheet Depot: http://www.paullinebarger.net/DS/
    ^If you have datasheets not listed PM me

    #2

    I can't seem to preach this enough: ALWAYS KEEP OFF-SERVER BACKUPS!!

    http://www.paullinebarger.net/DS/
    Yes, things seem to be broken. I hope you get it sorted out.
    <--- Badcaps.net Founder

    Badcaps.net Services:

    Motherboard Repair Services

    ----------------------------------------------
    Badcaps.net Forum Members Folding Team
    http://folding.stanford.edu/
    Team : 49813
    Join in!!
    Team Stats

      #3

      Hmm, I wonder if I can back it up over SSH to my webhost; I have "unlimited storage". Is there a package for that?
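One common answer is rsync over SSH, assuming the webhost allows SSH logins and has rsync installed (neither is confirmed for any particular host). The Python sketch below only builds the command line; the user, host, and paths are placeholders, and `build_rsync_command` is a hypothetical helper name.

```python
import shlex


def build_rsync_command(source_dir, remote_user, remote_host, remote_dir, dry_run=True):
    """Build an rsync-over-SSH command as an argument list (nothing is executed)."""
    if not source_dir or not remote_dir:
        raise ValueError("source_dir and remote_dir must be non-empty")
    # -a: archive mode (keeps permissions and timestamps), -z: compress in transit,
    # -e ssh: use SSH as the transport.
    cmd = ["rsync", "-az", "-e", "ssh"]
    if dry_run:
        cmd.append("--dry-run")  # show what would be copied without copying
    # A trailing slash on the source means "copy the contents of this directory".
    cmd.append(source_dir.rstrip("/") + "/")
    cmd.append(f"{remote_user}@{remote_host}:{remote_dir}")
    return cmd


if __name__ == "__main__":
    # Placeholder values, not a real account or host.
    print(shlex.join(build_rsync_command("/srv/data", "backupuser", "example.com", "backups/server1")))
```

Once the dry run lists what you expect, the same list can be passed to `subprocess.run(cmd, check=True)` with `dry_run=False`.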

        #4

        Sounds like possibly a new malware...
        ASRock B550 PG Velocita

        Ryzen 9 "Vermeer" 5900X

        16 GB AData XPG Spectrix D41

        Sapphire Nitro+ Radeon RX 6750 XT

        eVGA Supernova G3 750W

        Western Digital Black SN850 1TB NVMe SSD

        Alienware AW3423DWF OLED




        "¡Me encanta "Me Encanta o Enlistarlo con Hilary Farr!" -Mí mismo

        "There's nothing more unattractive than a chick smoking a cigarette" -Topcat

        "Today's lesson in pissivity comes in the form of a ziplock baggie full of GPU extension brackets & hardware that for the last ~3 years have been on my bench, always in my way, getting moved around constantly....and yesterday I found myself in need of them....and the bastards are now nowhere to be found! Motherfracker!!" -Topcat

        "did I see a chair fly? I think I did! Time for popcorn!" -ratdude747

          #5

          Originally posted by Topcat View Post
          I can't seem to preach this enough: ALWAYS KEEP OFF-SERVER BACKUPS!!
          I keep 2 in 2 different locations just because I hate losing things.

            #6

            Yeah, you need to watch that. Running it from the wrong path, or with unset variables can cause chaos. Take the Steam Client for an example: https://www.theregister.co.uk/2015/0...ans_linux_pcs/ I got bit by that bug. Took out the drives being backed up and the drives that were taking the backup... Both my "production" data and backup data gone like that.
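As reported at the time, the Steam bug came down to a recursive delete whose path variable expanded to nothing. Here is a hedged Python sketch of the kind of guard that catches that class of mistake; `safe_rmtree` and its `must_be_inside` parameter are hypothetical names, not part of any library.

```python
import os
import shutil


def safe_rmtree(path, must_be_inside):
    """Recursively delete path only if it lies strictly inside must_be_inside."""
    if not path or not path.strip():
        # An empty string is what an unset variable usually turns into.
        raise ValueError("refusing to delete: empty path (unset variable?)")
    target = os.path.realpath(path)
    root = os.path.realpath(must_be_inside)
    # Reject the root itself and anything that is not underneath it.
    if target == root or os.path.commonpath([target, root]) != root:
        raise ValueError(f"refusing to delete {target!r}: not inside {root!r}")
    shutil.rmtree(target)
```

The point of the design is that the caller must state up front where deletions are allowed, so a bad path fails loudly instead of walking the whole filesystem.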
            Last edited by goontron; 07-25-2018, 10:42 PM.
            Things I've fixed: anything from semis to crappy Chinese $2 radios, and now an IoT Dildo....

            "Dude, this is Wyoming, i hopped on and sent 'er. No fucking around." -- Me

            Excuse me while i do something dangerous


            You must have a sad, sad boring life if you hate on people harmlessly enjoying life with an animal costume.

            Sometimes you need to break shit to fix it.... Thats why my lawnmower doesn't have a deadman switch or engine brake anymore

            Follow the white rabbit.

              #7

              Originally posted by goontron View Post
              Yeah, you need to watch that. Running it from the wrong path, or with unset variables can cause chaos. Take the Steam Client for an example: https://www.theregister.co.uk/2015/0...ans_linux_pcs/ I got bit by that bug. Took out the drives being backed up and the drives that were taking the backup... Both my "production" data and backup data gone like that.
              Seems like something you could sue Valve for.

                #8

                Originally posted by brethin View Post
                I keep 2 in 2 different locations just because I hate losing things.
                Yup, though even that may not be sufficient!

                I keep "cold" backups of everything -- two on rust and at least one on another medium (tape, MO, CD/DVD, etc.).

                Some years ago, I had a "disk crash" (or so I thought!). So, I shrugged and pulled the (first!) cold backup out and mounted it in an external case (SCSI-based system). This disk proved unreadable (WTF?). Maybe bad luck or a problem with the enclosure/cabling?

                Set it aside and pulled (SECOND!) backup out and repeated the exercise. And, should NOT have been surprised to see the exact same results!

                Now we KNOW something is seriously hosed!!

                So, I pulled out the MO backup, cabled an MO drive to the system and restored the drives from the MO (largely read-only) backup.

                Turns out the OS release I had upgraded to had a bug in the quirks table for the drives that I happened to be using as my cold backups. Mounting any of them (without manually installing the "read-only" jumper on the drive itself) would result in the superblock being trashed.

                So, roll back to an earlier OS release, copy the MO image onto both cold backups (which, not surprisingly, actually DO work!) and only lose a day of my time (and a few years off my life from the stress).

                Now, I keep multiple copies of "stuff" and in varied places. I have a database that tells me which files (and their MD5s) are located on which media, so if I lose any copy of a file, I can quickly locate a backup copy of it, regardless of where it may be stored.
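A rough sketch of what such a catalog could look like, using Python's built-in sqlite3 and hashlib. The schema, the table name `copies`, and the function names are assumptions for illustration, not the actual system described in this post.

```python
import hashlib
import sqlite3


def open_catalog(db_path=":memory:"):
    """Open (or create) a catalog mapping (volume, path) to an MD5 checksum."""
    con = sqlite3.connect(db_path)
    con.execute(
        """CREATE TABLE IF NOT EXISTS copies (
               volume      TEXT NOT NULL,
               path        TEXT NOT NULL,
               md5         TEXT NOT NULL,
               verified_at REAL,
               PRIMARY KEY (volume, path))"""
    )
    return con


def md5_of_file(file_path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with open(file_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_copy(con, volume, path, md5):
    """Note that a file with this checksum lives at path on volume."""
    con.execute(
        "INSERT OR REPLACE INTO copies (volume, path, md5) VALUES (?, ?, ?)",
        (volume, path, md5),
    )
    con.commit()


def other_copies(con, md5, exclude_volume):
    """List (volume, path) pairs holding the same content on other volumes."""
    rows = con.execute(
        "SELECT volume, path FROM copies WHERE md5 = ? AND volume != ? "
        "ORDER BY volume, path",
        (md5, exclude_volume),
    )
    return rows.fetchall()
```

When a copy on one volume goes bad, `other_copies` answers the question "where else do I have this file?" without any of the other media being mounted.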

                  #9

                  A backup isn't a backup until it's been verified.

                    #10

                    This is why I set up my server to back up its RAID10 array to a separate drive every week and then clear out old backups so it doesn't overflow. It's still in-server, but it's on a separate drive controller, so I consider that to be good enough.
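The "clear out old backups" step is where a mistake can delete the wrong thing, so here is a minimal Python sketch of just the selection logic. The retention count of four weekly backups and the name `backups_to_delete` are assumptions, not the actual setup described here.

```python
from datetime import date


def backups_to_delete(backup_dates, keep=4):
    """Return the backup dates to prune, keeping the `keep` most recent ones."""
    if keep < 1:
        raise ValueError("keep must be at least 1, or every backup would be pruned")
    newest_first = sorted(backup_dates, reverse=True)
    return newest_first[keep:]


if __name__ == "__main__":
    weekly = [date(2018, 6, 24), date(2018, 7, 1), date(2018, 7, 8),
              date(2018, 7, 15), date(2018, 7, 22)]
    print(backups_to_delete(weekly))
```

Keeping the decision separate from the actual deletion makes it easy to print and inspect the list before anything is removed.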

                    (Insert witty quote here)

                      #11

                      This is a good reason to try to use ZFS for backups.
                      Everything gets checksummed.

                        #12

                        Well, this Linux RAID is being a pain. I created new partitions on the disk that is to join the array, and even though they have the same number of blocks, it says they're too small. Ugh. I have a PERC 5 card I managed to get out of an old customer's workstation, because he had a crashed RAID 5 array and I convinced him to use Intel onboard RAID 1 (that, and the card had started to have an odd PCIe conflict with another device).

                        Not sure the IPMI will enumerate it for control on a server this old, though.

                          #13

                          Not sure if running Nautilus will help here, but I thought I would mention it anyway.

                            #14

                            Originally posted by diif View Post
                            A backup isn't a backup until it's been verified.
                            A backup can be successfully verified and still found to be defective when it is eventually needed! E.g., both of my "cold backups" were actually intact -- it was the OS that had been compromised which rendered them inaccessible at a much LATER date.

                            My "system" tracks the last time it "examined" each volume in the database. When a volume is encountered, it determines which files on that volume have not been re-verified in a particular number of days and starts a task running to read the file in its entirety and verify the checksum computed matches the checksum stored for that file in that folder on that volume in the database. If so, it records the timestamp for that "verification" and then moves on to process the next such file.

                            If the database indicates files that haven't been checked in some "verification interval", it emails me to mount those volumes so they can be examined.
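The re-verification pass described above might be sketched like this in Python. It is an illustration under assumptions, not the poster's code: the records are plain dictionaries with hypothetical keys `path`, `md5`, and `verified_at` (seconds since the epoch, or `None` if never checked).

```python
import hashlib
import os
import time


def file_md5(path, chunk_size=1 << 20):
    """Read the whole file in chunks and return its MD5 hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_stale(records, mount_point, max_age_days, now=None):
    """Re-check files not verified within max_age_days.

    Updates verified_at on records that pass and returns a list of
    (path, problem) tuples for those that do not.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    problems = []
    for rec in records:
        if rec["verified_at"] is not None and rec["verified_at"] >= cutoff:
            continue  # verified recently enough, skip
        full = os.path.join(mount_point, rec["path"])
        try:
            actual = file_md5(full)
        except FileNotFoundError:
            problems.append((rec["path"], "file not found"))
            continue
        except OSError:
            problems.append((rec["path"], "read error"))
            continue
        if actual != rec["md5"]:
            problems.append((rec["path"], "checksum mismatch"))
        else:
            rec["verified_at"] = now
    return problems
```

Running this whenever a volume happens to be mounted is what makes the check "free": only the files whose last verification is older than the chosen interval are read.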

                            This happens regardless of where (which network node) the devices are mounted and regardless of the media involved. E.g., my CDs, DVDs, MOs, thumb drives, drives in sleds, external USB drives, etc. are all managed with the same mechanism (though I can schedule different "verification intervals" for each volume to manage the amount of manual intervention that is required of me). Do you know that the pile of optical media you've written over the years is still intact? If you don't care, then why hold onto it??

                            Because of this, any time I happen to mount a volume for "some other reason", the contents of the volume can be checked "for free".

                            When a discrepancy is encountered (file can't be read, file not found, checksum mismatch), the database tells me where I can find a "copy" of that particular file so that the defective copy can be repaired.

                            Does your RAID array tell you if ALL the files you are NOT accessing, now, are intact? Do you have to verify its entire contents in order to reassure yourself that it is intact? Are ALL of the files on that medium equally important to you? Do you really want to verify the ISO images of the installation CDs -- which you happen to have squirreled away in a desk drawer -- just because they happen to reside on THAT array? Or, would you be equally confident verifying them every month or three -- KNOWING that the masters also exist on non-rust?

                              #15

                              Originally posted by brethin View Post
                              I keep 2 in 2 different locations just because I hate losing things.
                              As do I. I keep 2 external drives in my safe on the property, and I've got another in a safe deposit box. RAIDs are nice, but there's a common misconception that so many have... RAID provides redundancy, not backup!

                                #16

                                Originally posted by stj View Post
                                this is a good reason to try to use ZFS for backups.
                                Everything gets checksummed.
                                But you then need ZFS to access the medium. What do you do if the box(es) that support it are down? How do you implement it on already written WORM media? etc.

                                The same problem applies to the various RAID technologies. When your RAID hardware (or system) dies, how do you access (or recover) the contents of those volumes?

                                I perform the checksums in-band and deliberately store them ON ANOTHER MACHINE (which can be replicated). There's nothing magical about the volumes that I'm checking -- no reliance on particular hardware (can you pull a drive from a Synology RAID array and install it in a "software RAID" box and expect to access its contents?). There's nothing magical about the filesystems being used -- I can check FAT12 floppies just as easily as ext2 or NTFS or...

                                And, I can mount a volume on any machine (with compatible hardware -- SCSI drives obviously need a SCSI HBA for access) and still gain access to the data.

                                The cost of keeping a spare HBA or SCSI enclosure is trivial compared to keeping a spare RAID *box* (that claims to be compatible with the other boxes you might have).

                                [You learn these lessons when you discover the hardware to access various types of media that you've used over the decades is suddenly not obtainable. Or, the support for it (OS drivers) has disappeared. Do you scurry to move all of that data forward onto new media? (How do you know it is intact when you do so?) Or, do you try to maintain legacy hardware to make it accessible in its original form? (What will you do when you can't buy CD/DVD drives anymore?)]

                                And, because I have the checksums (MD5s) for all of these files available, I can find likely duplicates just by querying the database: two files (which might have different names and reside in different folders -- on different volumes OR ON THE SAME VOLUME) that share a checksum value are likely the same -- or, I can be prompted to make both available to the system so it can make that determination (and record it!).

                                This has already been helpful in identifying duplicate copies of files that I did not care to maintain (e.g., "805-1709-12.pdf" and "Sun Ultra 60 Service Manual.pdf" are identical documents differing only in the name that I assigned to them and the folders I stuffed them into!)
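The duplicate hunt amounts to grouping catalog entries by checksum. A small Python sketch, with `likely_duplicates` as a hypothetical name and the volume labels, folder paths, and checksum strings in the usage note as placeholders rather than real values:

```python
from collections import defaultdict


def likely_duplicates(entries):
    """Group (volume, path, checksum) entries by checksum.

    Returns {checksum: [(volume, path), ...]} for every checksum that
    appears more than once, i.e. files that are probably identical.
    """
    by_checksum = defaultdict(list)
    for volume, path, checksum in entries:
        by_checksum[checksum].append((volume, path))
    return {c: locs for c, locs in by_checksum.items() if len(locs) > 1}
```

For example, feeding it `("disk-A", "sun/805-1709-12.pdf", "aaaa")` and `("disk-B", "manuals/Sun Ultra 60 Service Manual.pdf", "aaaa")` would report the two paths under the shared placeholder checksum `"aaaa"`; a bytewise compare can then confirm they really are the same document.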

                                  #17

                                  MD5 is obsolete, FFS! For example, for ISOs, SHA is regularly used now.

                                    #18

                                    Originally posted by RJARRRPCGP View Post
                                    MD5 is obsolete, FFS! For example, for ISOs, SHA is regularly used now.
                                    MD5 is obsolete due to its vulnerability to HACKING/cracking. It is still robust enough to produce a unique signature of any nontrivial file contents without concern for collisions. I.e., ask yourself what type of data corruption would cause a file's contents to change in such a way that there would be an undetectable collision in the MD5 wrt the "correct" contents.

                                    All I use the signature for is to verify that the contents of the file appear to be unaltered -- WITHOUT having to do a bytewise compare to another copy of the file (which may not be "online" at the moment).

                                    MD5 is, on average, faster to compute than any of the SHA variants when the host platform -- as well as file size -- is variable. My goal, of course, is to process as many files as quickly as I can so the user doesn't have to "wait" while the system runs around checking things.

                                    (You want to be able to mount a volume to access something of interest to YOU, not to cater to the system's need to check files. The system, OTOH, wants to exploit every opportunity it has to access the files on that volume so it can vouch for their integrity, NOW.)
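Whether MD5 actually beats the SHA family depends on the CPU and the hash library; on hardware with dedicated SHA instructions the ordering may differ. A quick way to measure it on a given box is sketched below in Python; `time_hash` is a made-up helper and the 16 MiB buffer size is an arbitrary choice.

```python
import hashlib
import time


def time_hash(algorithm, data, repeats=5):
    """Return the best-of-N wall-clock time (seconds) to hash data once."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(algorithm, data).digest()
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    payload = b"\0" * (16 * 1024 * 1024)  # 16 MiB of zeros, arbitrary test size
    for name in ("md5", "sha1", "sha256"):
        seconds = time_hash(name, payload)
        print(f"{name:7s} {len(payload) / seconds / 1e6:8.1f} MB/s")
```

No expected numbers are given here on purpose, since they vary from machine to machine.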
                                    Last edited by Curious.George; 07-26-2018, 03:28 PM.

                                      #19

                                      I fear that a future collision could result in a failure to detect corruption.

                                      Even so, it's much better than CRC for files. I saw MD5 do a good job of catching optical drive misreads, IIRC.
                                      Last edited by RJARRRPCGP; 07-26-2018, 06:03 PM.

                                        #20

                                        Originally posted by RJARRRPCGP View Post
                                        I fear that a future collision could result in a failure to detect corruption.

                                        Even so, it's much better than CRC for files. I saw MD5 do a good job of catching optical drive misreads, IIRC.
                                        Realistically, the sorts of errors that will manifest will be:
                                        • file not found
                                        • unrecoverable read errors
                                        • drive failure


                                        Keep in mind that, unlike ZFS, RAID, etc. my scheme tolerates the volumes being accessed "unsupervised". E.g., I can take an external USB drive and manually change "something" -- or someTHINGS -- without the system ever seeing me make those changes. So, I can delete a file -- or rename it, or move it, etc. -- and the system will not see me doing those things (so that it can update its notion of the file's new name, location, etc.).

                                        I can likewise make changes to the file's content while "out of sight" and it won't know to update the checksum (signature) stored in the database to reflect those changes.

                                        Removable media (esp. CD/DVD) can fail while offline and throw UREs. Again, something that doesn't happen with RAID/ZFS/etc. (volumes are never really "offline" while the rest of the system is running).

                                        And, of course, a drive can always have a catastrophic failure (fail to spin up).

                                        Note that most archive formats (ZIP, ARC, RAR, etc.) rely on simple checksums to vouch for the integrity of their contents. How often have you encountered one that fails to self-verify after it had previously done so?

                                        Compute the MD5 of this message. Then, alter it in such a way that its length and MD5 remain unchanged. Then, try to convince me that your alterations are representative of a likely hardware/media failure!
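A small Python experiment in the spirit of that challenge: flip every single bit of a message in turn, keeping the length unchanged, and count how many of the altered versions still produce the original MD5. The helper names are made up for illustration, and single-bit flips are only one crude model of media corruption.

```python
import hashlib


def flip_bit(data, bit_index):
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def undetected_single_bit_flips(data):
    """Count single-bit corruptions that leave the MD5 digest unchanged."""
    original = hashlib.md5(data).digest()
    return sum(
        1
        for i in range(len(data) * 8)
        if hashlib.md5(flip_bit(data, i)).digest() == original
    )


if __name__ == "__main__":
    message = b"well I lost everything on my server"
    print(undetected_single_bit_flips(message))
```

The deliberate collisions that broke MD5 for security use need carefully constructed inputs, which is a different thing from the random damage a failing disk or disc produces.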
