This case study looks at an incident that occurred within an organisation and the challenges faced by the team that responded. After the incident, several simple changes were recommended to improve processes and prevent potential data loss on mission-critical systems.
On commencing work, a system administrator found that multiple anti-virus alerts had been triggered. On initial investigation, the administrator believed the system from which the alerts came was ‘air gapped’ from other highly sensitive, mission-critical systems. The administrator did not report the incident for four days, until further alerts prompted a call to the Computer Incident Response Team (CIRT). The CIRT responded swiftly and began its initial triage and investigation.
The team assisting the CIRT had no knowledge of what had transpired during the initial call and was unaware of the anti-virus alerts. This lack of communication between the teams was the first challenge. The second and largest challenge was the lack of documentation, network diagrams and understanding of the system. It emerged that the affected system was in fact connected to multiple mission-critical systems and was not ‘air gapped’ as initially thought. Further investigation found that the malware was a self-replicating worm that had spread to almost all of the connected systems. Some of these systems were still running Windows 98 and had not been backed up in over five years. The final challenge was identifying the root cause of the infection. After a week of investigation and questions to multiple administrators, it was found that no dedicated sheep-dip machine was used to check removable media before use on the system.
During the post-incident ‘lessons learned’ exercise, many improvements and new procedures were recommended to the system owner. The first was simply better communication, both at shift handovers and from senior management down to the technical administrators. The next was documentation and diagrams. Preparation is of the utmost importance in an incident response plan, and having correct, up-to-date network diagrams and topologies is crucial to stopping the spread of malware. The final recommendation was to use only system-accredited USB drives, to scan them before each use on the system and to log each scan. During an incident, log data is critical to finding a root cause and building a timeline of events.
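The logging recommendation can be illustrated with a minimal sketch. It is not the organisation's actual tooling: the field names, the JSON-lines log format and the functions `record_scan` and `build_timeline` are all assumptions made for illustration. It shows one way a removable-media scan might be recorded at the sheep-dip stage, and how those records could later be ordered into a timeline during an investigation.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class MediaScanEvent:
    """One removable-media scan record (all field names are hypothetical)."""
    timestamp: str      # ISO 8601 timestamp with UTC offset
    drive_id: str       # asset tag of the accredited USB drive
    operator: str       # who carried out the scan
    target_system: str  # system the drive is about to be used on
    scan_result: str    # e.g. "clean" or "infected"


def record_scan(log_path, drive_id, operator, target_system, scan_result, when=None):
    """Append one scan event to a JSON-lines log file and return it."""
    when = when or datetime.now(timezone.utc)
    event = MediaScanEvent(when.isoformat(), drive_id, operator,
                           target_system, scan_result)
    # Append-only writes keep earlier records untouched.
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(event)) + "\n")
    return event


def build_timeline(log_path):
    """Read every logged event and return them ordered by timestamp."""
    with open(log_path, encoding="utf-8") as fh:
        events = [json.loads(line) for line in fh if line.strip()]
    return sorted(events, key=lambda e: datetime.fromisoformat(e["timestamp"]))
```

With records like these, an investigator could call `build_timeline` on the log and see at a glance which drive was used on which system, by whom and when, rather than relying on interviews with multiple administrators.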