Things to consider before you begin your SOAR Project
Chapter written by Razvan GAVRILA ([email protected]), ENISA, Senior Cyber Expert
Any modern organisation that wants to protect its information assets and mission-critical functions must be able to respond quickly to security events, preferably before they escalate into serious incidents. SOAR, which stands for security orchestration, automation, and response, was created with this goal in mind. It enables organisations to collect data about potential security risks and respond to security events autonomously or with limited human assistance. Furthermore, SOAR forces the organisation to consider the risk scenarios to which it is exposed and to agree on its risk appetite: which threats must be addressed, how, and by whom.
The following section discusses some of the primary benefits and challenges of implementing a SOAR system in your organisation. Keep in mind that you will not be able to complete everything overnight, and that you should scale up your effort gradually. This section is by no means exhaustive; you should tailor it to your specific needs, industry, and regulatory environment.
As launching a SOAR project requires investment (both human and financial), the organisation must be able to clearly articulate the expected benefits. Since security is sometimes viewed as a burdensome operational expense, the organisation's management should be briefed on why such a project is needed. The following points can assist those tasked with preparing the SOAR business case.
-
SOAR can be a powerful business enabler because it allows organisations to define their security needs in terms of services that can be provided both internally (by a SOC-like function) and externally (by specialised Managed Service Providers or a virtual SOC). As the services can be mapped to specific risks that the organisation wishes to address, such an initiative fits naturally into the organisation's larger enterprise risk management programme.
-
SOAR can improve compliance reporting by standardising the incident response process from start to finish. Corporate boards will have a clearer picture of the number of security events detected, their status, and resolution, while security teams will be able to generate these reports using automation. When dealing with third-party service providers, such reports can serve as contract performance indicators, which can be evaluated through dedicated pentests or red team engagements.
-
SOAR encourages organisations to evaluate their security processes. This improves knowledge management and allows lessons to be captured when a particular process is found to be ineffective or lacking. In essence, this helps to reduce blame culture by shifting the security operator's role towards process optimisation.
-
Setting up a SOAR system imposes a sense of discipline on how the organisation executes its information security playbooks. The flow from Tier 1 (basic IT Security Help Resolution) to Tier 3 (security or forensic expert) is unambiguous. SOAR is the opposite of ad hoc InfoSec work (which is toxic, wasteful, and dangerous).
-
SOAR enables organisations to deal with the high volume and velocity of events generated by a broad set of devices (endpoints, network appliances, security appliances, applications, and so on), and it is the logical next step after a well-designed SIEM. This implies that the security function will be able to triage the mundane more quickly and focus on critical cases.
It is critical to have a solid foundation before embarking on your SOAR journey. Security automation can be disruptive to an organisation that has not addressed the fundamentals: leadership commitment and business owner buy-in. The previous section discussed some of the points you can use to convey your message and gain solid support. Keep in mind that abruptly locking accounts, increasing the number of tickets raised to the IT help desk, or increasing the number of security alerts reported to management can all result in unintended consequences. The organisation can quickly go into shock, jeopardising your project.
Before you begin, ensure that you have the following covered:
-
You should not embark on this journey if your current asset management practices are underdeveloped. If you can't answer questions such as which assets are in scope, who owns them, and what value is assigned to each asset, your playbook logic may be skewed by arbitrary circumstances. The asset list will define the scope of your project and provide an indication of your potential SOAR roadmap (e.g. start with the endpoints, then servers, etc.).
-
Poor logging practices (too little telemetry, or logs that are too verbose) make it impossible to have a good SOAR in place. Logs give your assets a voice, and they are the single most important piece of the puzzle (especially when you need to run a complex investigation that a standard SOAR playbook cannot handle). Verify that your logging practices, such as log retention, are in line with your organisation's risk appetite and compliance obligations (for example, data protection).
-
Speaking of logs, make sure the ones you will use are relevant and complete, meaning they contain the information needed to carry out the automated action and any follow-up. Try to establish a consistent field naming convention: it helps you unify logs from various sources and makes it easier to follow up on automated actions.
-
Always prioritise process over tools and never the other way around. A good SOAR system must consider first what needs to be accomplished and then how. This implies that before mirroring current practices, you should consider what parts of the process can be optimised or eliminated.
-
Organisations should not underestimate the importance of risk assessment. Any response on auto-pilot requires some thought about what is tolerable for the organisation in terms of risk: shutting down systems, blocking connections, locking out accounts, and so on may have serious consequences for the organisation's mission. Check if the business owners are aware of the scenarios and cases in which a playbook may actively change the state of a system.
-
While not required, a well-designed SIEM system can serve as a good building block/enabler in the development of a good SOAR system.
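The advice above on relevant, complete, and consistently named log fields can be sketched as a small normalisation step. The following is a minimal illustration in Python, not a reference implementation: the two log sources (`firewall`, `edr`), their field names, and the common schema (`src_ip`, `user`, `timestamp`) are all hypothetical and would need to be replaced with whatever your own sources and playbooks actually use.

```python
from typing import Any

# Hypothetical mapping from source-specific field names to a common schema.
FIELD_MAP: dict[str, dict[str, str]] = {
    "firewall": {"srcaddr": "src_ip", "usr": "user", "ts": "timestamp"},
    "edr": {"SourceIp": "src_ip", "UserName": "user", "EventTime": "timestamp"},
}

# Fields a playbook is assumed to need before it can act (illustrative only).
REQUIRED_FIELDS = {"src_ip", "user", "timestamp"}


def normalise(source: str, event: dict[str, Any]) -> dict[str, Any]:
    """Rename known fields to the common schema; keep unknown fields as-is."""
    mapping = FIELD_MAP.get(source, {})
    return {mapping.get(key, key): value for key, value in event.items()}


def missing_fields(event: dict[str, Any]) -> set[str]:
    """Return the required fields that are absent or empty in a normalised event."""
    return {name for name in REQUIRED_FIELDS if not event.get(name)}


# Sample events (made-up values) from the two hypothetical sources.
fw_event = normalise(
    "firewall",
    {"srcaddr": "10.0.0.5", "usr": "alice", "ts": "2024-01-01T10:00:00Z"},
)
edr_event = normalise("edr", {"SourceIp": "10.0.0.5", "UserName": "alice"})
```

Here `missing_fields(fw_event)` is empty, while `missing_fields(edr_event)` reports that `timestamp` is absent. A check like this, run before a playbook fires, is one way to avoid automating on incomplete logs.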
The security playbooks of your organisation will serve as the foundation of your SOAR system. A security playbook is designed to provide a clear picture of security roles and responsibilities in relation to cyber security: before, during, and after a security incident.
The following section discusses some well-known good practices you may want to consider:
-
SOAR use cases and playbooks must be documented at all times. They should include, at a minimum, the following: the playbook's goal, the owner, the steps, the actors involved, the metrics to be measured, such as mean time to detect (MTTD) and mean time to repair (MTTR), and override options. Also, ensure that these playbooks are easily accessible to those who require them, particularly those in charge of maintaining your SOAR system.
-
Your playbooks should ideally map neatly onto the standard incident management lifecycle: detect, triage, prioritise, respond, recover, and report. While this may appear difficult, it will give you a better understanding of what the automated process is doing and will make upstream reporting easier.
-
Before actively deploying a playbook, it's a good idea to run scenarios or stress tests to determine the true throughput of the playbook. For example, you might want to know how many alerts your team can handle per day, week, quarter, or year. Conducting such stress tests is critical because any human action or decision in the workflow can become a bottleneck.
-
Create distinct success criteria for each playbook. For playbooks that actively change the state of a system and require a higher level of confidence, try to improve the decisions by incorporating enrichment sources (e.g., commercial APIs or CTI feeds).
-
Make sure you can distinguish between high-risk playbooks, which can have a wide-ranging impact on the organisation's business (e.g., shutting down servers, changing firewall rules), and lower-risk ones, such as blocking a user account after multiple failed logins or blocklisting an email address used in phishing.
-
Regularly review the playbooks and make sure that the analysts' lessons learned are captured in any new iteration. You may wish to establish a regular cycle, which can take the form of dedicated workshops with your Tier 1 security investigators or external experts, such as red teamers. Consider using well-established frameworks to manage your playbook use cases, such as the Dutch Management, Growth and Metrics & Assessment (MaGMa) Framework.
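The minimum documentation items and the high-risk versus lower-risk distinction described above can be captured in a simple record. The sketch below is one possible shape, assuming Python dataclasses; the class and field names are hypothetical. The rule that high-risk playbooks wait for human approval is also an assumption made for illustration, not something a particular SOAR product prescribes.

```python
from dataclasses import dataclass, field
from enum import Enum


class RiskTier(Enum):
    LOW = "low"    # e.g. blocking an account after multiple failed logins
    HIGH = "high"  # e.g. shutting down servers, changing firewall rules


@dataclass
class Playbook:
    name: str
    goal: str = ""
    owner: str = ""
    steps: list[str] = field(default_factory=list)
    actors: list[str] = field(default_factory=list)
    metrics: list[str] = field(default_factory=list)  # e.g. ["MTTD", "MTTR"]
    override: str = ""  # how a human can stop or reverse the playbook
    risk_tier: RiskTier = RiskTier.HIGH  # default to the cautious tier

    def undocumented(self) -> list[str]:
        """List the minimum documentation items that are still empty."""
        required = ["goal", "owner", "steps", "actors", "metrics", "override"]
        return [item for item in required if not getattr(self, item)]

    def needs_human_approval(self) -> bool:
        """Assumed policy: high-risk playbooks wait for a human decision."""
        return self.risk_tier is RiskTier.HIGH


# A fully documented, lower-risk playbook (illustrative values).
lockout = Playbook(
    name="lock-account-after-failed-logins",
    goal="Limit password-guessing attempts",
    owner="SOC lead",
    steps=["Detect repeated failures", "Lock account", "Notify user"],
    actors=["Tier 1 analyst", "IT help desk"],
    metrics=["MTTD", "MTTR"],
    override="Help desk can unlock the account",
    risk_tier=RiskTier.LOW,
)

# A draft that has a name but no documentation yet.
draft = Playbook(name="isolate-server")
```

In this sketch `lockout.undocumented()` returns an empty list, whereas `draft.undocumented()` lists every missing item, and `draft.needs_human_approval()` is true because undeclared playbooks default to the high-risk tier. A check of this kind could run as part of the regular review cycle.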
SOAR, like any other system, has its own set of challenges. This section attempts to highlight some of them briefly:
-
If your playbooks are consuming and acting directly on threat intelligence from CTI feeds or platforms, you should proceed with caution. Establishing a trust hierarchy (both in terms of sources and classes of observable) will reduce the number of false positives, but it is not always an easy task.
-
When using Managed Service Providers or virtual SOC services, organisations must consider data protection and intellectual property implications, as the DFIR process can be both intrusive and (in some cases) illegal under GDPR provisions (purpose and scope of collection and processing, data transfers, etc.). In such cases, the only tools available to meet compliance needs will be contractual in nature, as technical controls, such as anonymisation and pseudonymisation techniques, may reduce the playbook's direct benefits.
-
SOAR systems work best on refined input and are not suited to ingesting large amounts of raw data. As a result, having a good SIEM in place to filter and correlate events first can help to streamline a successful SOAR.
-
Finding the right balance between the responsiveness of automated tasks and human or business judgment is not always easy. For example, shutting down a system may prevent an attack from spreading, but it may have serious consequences if the system does not turn back on.
-
While SOAR can be an excellent addition to an organisation's executive reporting, no system can generate an executive/board-ready report following an investigation into a serious security incident. In essence, no SOAR system (at least with currently available technology) can replace a skilled and experienced information security analyst.
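Two of the challenges above, the trust hierarchy for threat intelligence and the balance between automated response and human judgment, can be illustrated together. The following is a minimal sketch in Python. The source names, observable classes, weights, and threshold are all hypothetical values chosen for illustration, and multiplying the two weights is just one simple way to combine them.

```python
# Hypothetical trust weights (0.0 to 1.0); real values would come from
# your own review of each feed and each class of observable.
SOURCE_TRUST = {
    "internal_sandbox": 0.9,
    "commercial_feed": 0.7,
    "open_feed": 0.4,
}
OBSERVABLE_TRUST = {
    "file_hash": 0.9,
    "domain": 0.6,
    "ip_address": 0.4,
}

# Assumed cut-off above which the playbook may act without a human.
AUTO_ACTION_THRESHOLD = 0.6


def trust_score(source: str, observable_type: str) -> float:
    """Combine source and observable-class trust; unknown entries score 0."""
    return SOURCE_TRUST.get(source, 0.0) * OBSERVABLE_TRUST.get(observable_type, 0.0)


def decide(source: str, observable_type: str) -> str:
    """Return 'auto_block' at or above the threshold, else 'analyst_review'."""
    if trust_score(source, observable_type) >= AUTO_ACTION_THRESHOLD:
        return "auto_block"
    return "analyst_review"
```

With these example weights, a file hash from the internal sandbox is blocked automatically, while an IP address from an open feed, or anything from an unknown source, is routed to an analyst. Sending everything below the threshold to a human keeps low-confidence intelligence from triggering disruptive actions.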
A word of caution about becoming technology/vendor dependent with your SOAR system: while embracing a specific product can result in clear benefits (for example, fast on-boarding, good support, and easy integration with the target ecosystem), keep in mind that a good SOAR deployment should be easily decoupled from the environment in which it operates (in particular if that environment is a Cloud Service Provider).
Finally, while a well-designed SOAR system can provide many benefits, it cannot replace a strong enterprise security and risk management culture. Because the threat landscape is constantly changing, deploying SOAR playbooks should not be viewed as a one-size-fits-all solution for all "security" issues. For example, the inherited risk of third-party libraries (weakness in the software supply chain) requires the use of a mix of proactive (trusted repositories, quality assurance, etc.) and reactive strategies (isolating potentially dangerous code). Putting SOAR in the middle of all of this may lead to additional complexity, so the advice is to keep the playbooks simple and "human readable."