(Reposted from our blog)
Hey guys, on Sunday July 21, 2019, we attempted to migrate the master node from Debian stretch to buster. During the migration we hit a bug with lxd which prevented the containers from working properly. We bought a server from soyoustart and waited to set things up. While this was going on we had a rather large DDoS hitting parts of our torrent service. Initially we tried to just migrate and continue with lxd; however, that didn’t work and cost us considerable time.
The following day, Jul 22, 2019 (message-1), I posted to our Telegram communities what information I knew and some things I believed to be accurate: that our server had gone down and that our keyserver for the archives was down. Prior to our infrastructure going down we changed the APT archives from stable to rolling without mentioning it officially, and this began to cause problems for people.
Our communication with our userbase was slow and clunky. Changing our infrastructure configuration from lxd to Docker-compose made things take much longer. Long term, however, we think it will speed things up and prevent this kind of incident from happening again. We realize that changing domain names might not have been the right answer, but we think it was best to make that change instead of waiting for a later time.
Three days later, on Jul 25, 2019, we sent out a second message (message-2) describing everything we knew and what we were doing. By this time things had stabilized for the most part.
In summary the things we learned:
- Have a plan.
Internally there was nothing that would point us in the direction of what to do or how to respond. What our sysadmins did was based entirely on their own experience, which wasn’t bad, but we could have done better. The lack of a plan for incident response and the over-reliance on one or two people made things considerably slower, both in communication and in execution of the recovery.
- Communicate and respond as a team.
Although we have a rather large team (30+ people), the number of people actually involved was much smaller and fairly spotty. We simply don’t have the pool of sysadmin experience we need. Our poor communication hamstrung efforts to respond and to alert users about what was going on, both beforehand and while the incident occurred. The failure to plan properly for our change from stable to rolling caused a lot of unnecessary issues which should have been addressed with proper planning and community outreach. Planning as a process is something we will be implementing going forward.
- Have a backup server and sysadmin training.
Part of this is due to a lack of funding which we hope will be resolved in the near future. Also we really appreciate the help and advice received from the staff at Hack The Box (HTB), but we should ensure our sysadmins get some training and familiarity with each part of the infrastructure thus avoiding an over-reliance on one person over another.
Here’s the deal: right now we’re in a bit of a bind. One thing has snowballed into another, and though we’ve learned some lessons that we will share later, that doesn’t help most of you right now.
Currently, the master key server is offline. That is why you can’t verify that your keys for the rolling URIs are secure. Unfortunately only one person has the GPG key access needed to sign the dependencies which have not been updated (which is why people are getting the conflicting distribution error, or a GPG error if they change their source list to rolling).
All of this happened while we had a server upgrade that went wrong and crashed bringing down most of our core infrastructure. Currently our sysadmins have bought a new server and are bringing our services online as fast as they can. We’ll update you when we have more to share.
I’d like to apologize for the lack of communication over everything. I believe we will have everything up and running within the next 12 hours and I hope to have the dependencies issue fixed as soon as possible.
First I’d like to apologise for the downtime. We’d also like to clarify some things that were said before that are not accurate. Our master server went down due to a bug in lxd during an update. We took a bit too long to deploy new hardware, as initially we tried to fix things. However, that didn’t work and we ended up wasting some time.
While this happened we had a rather large DDoS attack hit some parts of our infrastructure. The DDoS made recovery a little slower, but the outage gave us the chance to do two things we’ve been planning for a while.
The first is to standardize our infrastructure with docker-compose and other safe-to-share technologies. This allows us to share our infrastructure configuration without exposing critical parts of it and without giving third-party access to our servers, and it enables fully transparent and reproducible server configurations.
The reason for the domain change is that we changed our name and focus from being solely for pentesters (ParrotSec) to pentesting, security, privacy and development (ParrotOS).
As of right now we have migrated our main website (www.parrot.sh), documentation portal (docs.parrot.sh) and the start page (start.parrot.sh). The forum (community.parrot.sh) has also been migrated; however, it is still being worked on.
Other services (gitlab for example) will be migrated at a later date and we’ll update you when we have a date.
We’d like to thank the team at HTB (https://www.hackthebox.eu/) for their advice and support.
To clarify some things I said in my last message: about a week and a half ago we made some changes in preparation for our migration to Devuan.
Among those changes was migrating users from stable to rolling and rolling-security, as we prepare to maintain two versions of Parrot OS: rolling, which will continue as a rolling release, and Long Term Support (LTS).
A sizeable chunk of users have had problems migrating to the new archives and we’d like to apologise for not being so clear or upfront about that change.
I misspoke about our key server; it’s been online the entire time. The issue, for those having problems with the rolling-security archive, was that the key wasn’t in the keyserver and thus couldn’t be verified.
This has now been fixed. If anyone still has issues, please let us know.
The change in archives, coupled with the inability to verify their security, caused a lot of people to think they had broken dependencies. The issue should be resolved by ensuring you have the proper source URIs and then running either
sudo parrot-upgrade
or
sudo apt update
followed by
sudo apt full-upgrade
If you are still having issues after adding the correct source lines to parrot.list and running sudo apt update, please post a message in either our main Telegram group or our Matrix channel, #parrot:matrix.org
These are the current URIs you should have in your source file located at
deb https://deb.parrotsec.org/parrot/ rolling main contrib non-free
#deb-src https://deb.parrotsec.org/parrot/ rolling main contrib non-free
deb https://deb.parrotsec.org/parrot/ rolling-security main contrib non-free
#deb-src https://deb.parrotsec.org/parrot/ rolling-security main contrib non-free
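If you want a quick way to confirm that your source file actually contains the two active entries above, something like the following might help. This is only a rough sketch and not an official Parrot tool: the function name check_parrot_sources is made up for illustration, and you have to pass the path to your own parrot.list as the argument. It only checks for exact, whole-line matches of the two deb lines and does not touch the file.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Parrot): report which of the
# expected archive entries are missing from a given APT source file.
# Usage: check_parrot_sources /path/to/your/parrot.list
check_parrot_sources() {
    file="$1"
    missing=0
    for suite in rolling rolling-security; do
        expected="deb https://deb.parrotsec.org/parrot/ $suite main contrib non-free"
        # -F: treat the pattern as a fixed string, -x: match the whole line,
        # -q: print nothing, just set the exit status
        if ! grep -Fxq "$expected" "$file"; then
            echo "missing: $expected"
            missing=1
        fi
    done
    # Returns 0 when both entries are present, 1 otherwise
    return "$missing"
}
```

Call it with the path to your source file. If it prints nothing and returns 0, both entries are there and you can go on to sudo apt update.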
Communication is key in any group interaction and we will be working on this going forward.