Here Tony and John discuss the various ways to gather risk event information and manage it using operational risk software: what information should be collected, who reports it, the effect on potential whistleblowers, and the importance of reporting thresholds.
Taken from: Mastering Risk Management
Who reports the data?
In a firm with the right risk culture, it is understood that reporting events is not about blame, but about learning. Many firms have a loss reporting form on their intranet which is available to all staff. In this case, staff are encouraged to report events as they happen.
Some firms allow anonymous reporting of losses, while most require the name of the person who detected the event. But the person who reports the event may not necessarily be the person who detects it. And the person who reports it may not necessarily be the person responsible for the event occurring, if anyone is responsible. Some firms require the name of the person’s manager and will send an automatic email to the manager as validation and confirmation of the event. That can, however, discourage whistle-blowing, which can be a useful source of identifying potential or actual high-impact/low-frequency events.
An alternative is for the risk leader or champion within the detecting department or business line to make the report. The advantage of this approach is that a risk leader will have some training in risk and will probably be a user of loss data. This should mean that the data will be of a higher quality compared with data submitted by an untrained person. However, there may be a time delay compared with a submission by the employee who discovers the event.
Reconciling losses to the general ledger (or an audit) will provide valuable confirmation and validation of the accuracy of reporting. However, it will not identify events where there is no financial impact or, of course, events which have not been reported.
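As a minimal sketch of such a reconciliation, assume that reported events and ledger entries share a common reference. The function name, references and amounts below are hypothetical and purely illustrative:

```python
from decimal import Decimal


def reconcile(reported, ledger):
    """Compare reported losses with ledger entries, keyed by a shared reference.

    Both arguments map a reference string to a loss amount.
    Returns (unreported, unposted, mismatched) as sorted lists of references.
    """
    # In the ledger but never reported as an event.
    unreported = sorted(ref for ref in ledger if ref not in reported)
    # Reported but absent from the ledger (for example, zero-impact events).
    unposted = sorted(ref for ref in reported if ref not in ledger)
    # Present in both, but the amounts disagree.
    mismatched = sorted(
        ref for ref in reported
        if ref in ledger and reported[ref] != ledger[ref]
    )
    return unreported, unposted, mismatched


# Hypothetical data for illustration only.
reported = {
    "EV-001": Decimal("1200.00"),
    "EV-002": Decimal("0.00"),
    "EV-003": Decimal("450.00"),
}
ledger = {
    "EV-001": Decimal("1200.00"),
    "EV-003": Decimal("540.00"),
    "EV-004": Decimal("8000.00"),
}

unreported, unposted, mismatched = reconcile(reported, ledger)
print(unreported)  # ['EV-004']
print(unposted)    # ['EV-002']
print(mismatched)  # ['EV-003']
```

Note that EV-002, an event with no financial impact, is known only because it was reported; the ledger alone would never have surfaced it.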
The reporting threshold, the level down to which a firm seeks to capture risk events, is a cost-benefit decision. The Basel Committee has set a threshold of €10,000 for loss reporting by banks. It is interesting that, in a survey of over 100 financial services firms from around the world, most had a threshold of €5,000 or below. A number of firms, including banks, have a policy to report all losses, no matter what size. This all-inclusive approach of course leads to a large number of events being reported, although at least it means that all events are captured.
The risk management departments of many firms which have set a reporting size limit generally analyse losses only at a higher level in order to keep the workload within reasonable bounds. It is worth noting that several smaller losses can add up to one larger loss which is above the analysis threshold. In this way continual small control failures are captured before they turn into a significant value. Weak signals can often demand strong action.
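To illustrate how small losses can add up past an analysis threshold, here is a rough sketch. It assumes each reported loss is tagged with the control that failed; the control names, amounts and the threshold of 10,000 are hypothetical:

```python
from collections import defaultdict
from decimal import Decimal


def aggregates_over_threshold(losses, analysis_threshold):
    """Sum losses per failed control and keep totals that reach the threshold.

    `losses` is an iterable of (control_id, amount) pairs.
    """
    totals = defaultdict(Decimal)  # Decimal() starts each total at zero
    for control_id, amount in losses:
        totals[control_id] += amount
    return {c: t for c, t in totals.items() if t >= analysis_threshold}


# Hypothetical losses, each individually below the analysis threshold.
losses = [
    ("payment-approval", Decimal("3000")),
    ("payment-approval", Decimal("4000")),
    ("payment-approval", Decimal("3500")),
    ("data-entry", Decimal("800")),
]

flagged = aggregates_over_threshold(losses, Decimal("10000"))
print(flagged)  # {'payment-approval': Decimal('10500')}
```

Grouping by failed control is only one possible choice; a firm might equally group by cause, business line or location.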
Many firms believe that capturing small losses costs more than the benefit achieved. However, a reporting threshold above zero will prevent a significant amount of control failure being captured, including the majority of loss events which in fact have a zero financial impact. These, and events where there have been small losses, will be picked up by successfully implementing a zero reporting threshold. Although large losses will still be captured and can be analysed, it is easily arguable that the firm will miss data today which could prevent a large loss from happening tomorrow.
Setting the reporting threshold is therefore a significant issue, the consequences of which should be properly understood by at least the risk committee and preferably the board.
Given the valuable business uses to which events and losses – and gains – can be put, the next step in the process is to decide what information should be gathered about the events and losses. The information collected will vary from firm to firm, but there is a minimum set of data attributes which is collected:
- Name of the firm in which the event occurred, if it’s in a group of companies
- Geographic location of the event
- Business activity
- Loss event type, down to a detailed level
- The event start date, discovery date (and end date, if the event has finished)
- Description of the event
- Causes of the event
- Amount of loss and recovery components
- Management actions taken.
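The minimum set of attributes above could be captured in a record along the following lines. This is a sketch only: the class name, field names and sample values are assumptions for illustration, not a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class LossEvent:
    """Illustrative record holding the minimum attributes of a risk event."""
    firm: str
    location: str
    business_activity: str
    loss_event_type: str
    start_date: Optional[date]      # may be unknown when first reported
    discovery_date: date
    description: str
    end_date: Optional[date] = None  # None while the event is ongoing
    causes: list[str] = field(default_factory=list)
    gross_loss: Decimal = Decimal("0")
    recoveries: Decimal = Decimal("0")
    management_actions: list[str] = field(default_factory=list)

    @property
    def net_loss(self) -> Decimal:
        """Loss after recoveries."""
        return self.gross_loss - self.recoveries


# Hypothetical event for illustration only.
event = LossEvent(
    firm="Example Bank Ltd",
    location="London",
    business_activity="Retail payments",
    loss_event_type="Execution, delivery and process management",
    start_date=date(2024, 3, 1),
    discovery_date=date(2024, 3, 15),
    description="Duplicate payment released to a supplier.",
    causes=["Second approval skipped"],
    gross_loss=Decimal("12000"),
    recoveries=Decimal("9500"),
    management_actions=["Payment recalled", "Approval workflow reviewed"],
)
print(event.net_loss)  # 2500
```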
Name of the firm
This may seem obvious, but in a group of companies more than one firm may be involved. Often both the name of the organisation in which the event occurred and the name of the organisation in which the event is detected are recorded as both of these are important from a risk management and control improvement perspective. In addition, data may be held relating to the firm which will suffer any loss, as this may be different from the firm in which the event occurred and the firm which detected the event.
Geographic location
Recording where an event happens is important from a risk management perspective. There may be control weaknesses which are inherent in a particular location or as compared with other locations. This may indicate a better or worse control culture. Either way, it is vital to understand each area’s control ability so that decisions on improving controls can be taken based on knowledge, rather than taking a blanket approach, possibly based on guesswork.
Business activity
Identifying the particular business activity or product line is useful, especially in a group where business units in different companies may be involved in the same activity or in selling similar products. It helps to achieve consistent reporting, both within the group and if external reporting is necessary, perhaps to a government body or regulator, although it is not often that the taxonomy of external reports conforms to internal ones. Recording the business activity can also identify units where controls which are operated across a particular activity have failed, or appear to have been particularly successful, and point to improvements which will benefit the whole group.
Loss event type
Classifying losses by business activity or product line is important. But the foundation of loss analysis is to be able to allocate losses to loss event types. The difficulty with loss event types is to have enough of them to break down risk losses into sufficient granularity to be useful for effective and intelligent risk management, without disappearing into an unwieldy myriad of detailed categories.
Loss event types are not a substitute for risk types. You need to be clear as to whether you are classifying risks or risk events. Risks are generally ‘failures to…’ or ‘poor…’. Events are the manifestation of a risk actually occurring, sometimes called the crystallisation of a risk.
Event dates
It might be thought that the date of an event or loss is a fairly simple piece of data to record. However, it can be difficult, particularly if the event occurred several months before detection. This may be because a contract has a delayed start date or simply that the detective controls mitigating the risk are poor or non-existent. Often the only clear date is when the event is discovered. Some events occur over a period of time, in which case it is helpful to record the start and end dates. Conversely, when an event is first reported, it is often ongoing and it may be a number of months before it is closed and the loss established.
Where the ‘event’ is in fact a number of separate events linked by a single cause, such as the unauthorised trading undertaken by Nick Leeson at Barings and Jérôme Kerviel at Société Générale, a single date may be inappropriate and a period may work better. In that case, though, it is important to consider the effect on estimates of likelihood, since dates are fundamental to this.
There are three dates that are typically captured for risk management purposes:
- The date of occurrence, which is when the event happened, and which may be a number of dates
- The date of discovery or detection, which again may be some time after the occurrence
- The date of closure, which is commonly taken to mean the date on which all actions relevant to the event are completed.
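With these three dates recorded, simple measures such as the time taken to detect an event, or how long it has been open, follow directly. A small sketch, with hypothetical function names and dates:

```python
from datetime import date
from typing import Optional


def detection_lag_days(occurrence: date, discovery: date) -> int:
    """Days between the event occurring and its being discovered."""
    if discovery < occurrence:
        raise ValueError("discovery cannot precede occurrence")
    return (discovery - occurrence).days


def days_open(discovery: date, closure: Optional[date], today: date) -> int:
    """Days from discovery to closure, or to `today` if still open."""
    end = closure if closure is not None else today
    return (end - discovery).days


# Hypothetical dates for illustration only.
print(detection_lag_days(date(2024, 1, 10), date(2024, 4, 10)))  # 91
print(days_open(date(2024, 4, 10), None, date(2024, 5, 10)))     # 30
```

Where an event has several occurrence dates, one option (an assumption here, not a rule) is to measure the lag from the earliest of them.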
Description of the event
At a minimum, a brief description of the event should be given. However, some firms require event descriptions which can run to a page or more. While it is helpful to have all the information recorded, this may work against the speedy and timely reporting of events – or even their being reported at all. A well-run firm may have an absolute requirement for a brief description within, say, 24 hours of the event being detected, followed by a more detailed description when sufficient information is available.
Causes of the event
Cause lies at the heart of risk management. It is not enough to know that an event occurred or nearly occurred. It is essential to understand why, so that remedial action can be taken. Reporting events and not causes means that they can be counted, but not managed. The cause of an event should form part of the detailed description of an event, although it is more helpful to report it separately. There is a danger, if cause, event and effect are not separately identified, that the loss event type will be used as a proxy for causal analysis. There is little business benefit in doing so, as the point of causal analysis is to provide a consistent basis for assessing risk. But the point is often ignored in favour of an apparent quick win.
There are certainly benefits to be gained through a more accurate description of the cause of an event and allocating causes to generic causal categories. But the most important information in reports of loss events is to identify the controls which have failed. At least a primary control failure should be identified, although firms should identify secondary control failures as well. A single event is often the result of a number of control failures. Careful causal analysis will identify priorities to enhance or improve controls.
Root cause analysis
There are two main approaches to root cause analysis: ‘The 5 Whys’ analysis and ‘Bowtie’ analysis. The 5 Whys is exactly as it sounds. You ask the question ‘Why did it happen?’, and then ask the question ‘Why did that happen?’ and repeat approximately five times. It may be less than five and sometimes more than five in order to get to the root cause of the event. And yes, it is just like that inquisitive three-year-old who can’t stop asking why. Action is of course needed on the root cause if the event is beyond appetite. The actions may also generate ‘The 5 So whats’. In other words, does the proposed action really make a difference?
While the 5 Whys tends to focus on preventative controls, Bowtie analysis (also known as Butterfly analysis) covers all four types of controls.
For Bowtie analysis, the directive and preventative controls that have failed are identified. In addition, the detective and corrective controls are also examined to determine if the event had a larger impact than might have been the case because some of the controls mitigating the consequence of the event also failed.
Since a single cause can trigger a number of different risk events, linked risks can also be identified by recording causes, as well as any risk indicators which relate to the event. This will enable a holistic analysis of events to be easily undertaken, which will link together the three fundamental risk management processes of risk and control self-assessments, indicators and event causal analysis.
Amount of loss and recovery components
The impact of events can be divided into hard and soft as well as direct and indirect. The hard direct impact, where it exists, is always recorded. The other three categories may be recorded, depending on the relative sophistication of the causal analysis carried out by the firm. It should also be recognised that the final amount of the loss may not be known for a number of months and only estimates may be available when the event is first detected. Alternatively, a first actual monetary value may be available immediately, which may then require changing as the event progresses and more information becomes available.
In a firm which operates in a number of currencies, particularly where an event spans several countries, attention should be paid to the currency in which the event and its subsequent increments are reported.
One final issue is where a number of events, possibly relatively small in value, are linked by a single cause, but in aggregate amount to a significant figure. Do you choose one aggregate amount, or record each of the much smaller events, representing the individual control failures which occurred?
The answer depends on your identification of control failures, or combinations of control failures, but your decision will have a significant effect on your risk modelling. It may also affect any insurance recovery.
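The effect on modelling can be seen with some simple arithmetic. The sketch below uses hypothetical figures (25 linked losses of 2,000 each) and compares the frequency and mean severity that a model would see under each recording choice:

```python
from decimal import Decimal


def frequency_and_mean_severity(amounts):
    """Return (number of events, mean loss per event) for a list of losses."""
    n = len(amounts)
    if n == 0:
        return 0, Decimal("0")
    return n, sum(amounts, Decimal("0")) / n


# Hypothetical: 25 small losses linked by a single cause.
individual = [Decimal("2000")] * 25
# The same losses recorded as one aggregate event.
aggregated = [sum(individual, Decimal("0"))]

print(frequency_and_mean_severity(individual))  # (25, Decimal('2000'))
print(frequency_and_mean_severity(aggregated))  # (1, Decimal('50000'))
```

The total loss is identical in both cases, but one choice gives a high-frequency, low-severity picture and the other a single high-severity event.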
Management actions taken
The actions recorded can be divided into two types: immediate actions and correct or improve actions. For example, when a laptop is lost, an immediate action will be to disable the laptop’s access to the firm’s network. This is typical of an immediate reaction to the detection of an event. Following causal analysis of the event, correct or improve actions may be:
- A staff note reminding staff to lock all laptops in the boot of their car when transporting them
- A redrafted policy regarding to whom laptops will be issued
- The purchase of encryption software to be installed on all laptops used within the firm.
There is a clear difference between the two types of actions: immediate damage limitation, followed by considered further action at amending and reinforcing controls or the implementation of additional controls.
It is helpful to allocate an owner to the loss so that there is clear responsibility for achieving the actions necessary to ensure that the event does not happen to the firm again.
Where a transaction or trade is involved, the unique transaction or trade number is recorded, together with any relevant client details. This is important if the event has a loss attached to it which may be passed back to the client.
Internal and external notifications may also be necessary: internally to the firm’s compliance, fraud or health and safety department, for instance; or externally, to a regulator or government authority, depending on the type of event that has occurred. And, of course, risk management must be notified if they are not already aware.
Timeliness of data
Event data degrade over time as the acceptable level of control environments and people’s perceptions of control environments change. For example, as IT environments change, manual controls will also change. Automated controls are also frequently updated as software improves. So, any analysis of loss event data must be careful to take the current environment into account.
In our next blog Tony and John discuss analysing risk events.
Mastering Risk Management by Tony Blunden and John Thirlwell is published by FT International. Order your copy here: https://www.pearson.com/en-gb/subject-catalog/p/mastering-risk-management/P200000003761/9781292331317