
Securing other people’s software - Part 2

I mentioned it in my previous post, but it’s worth mentioning again: as an IT professional – especially one working in the utility sector – there will come a time in your career where you’re given a piece of third-party software and asked to commission it for use within your organisation. You’ll need to ensure that this software is installed in a way that not only meets business requirements but does so without exposing the organisation to unnecessary (cyber) risk.

It’s an awesome responsibility, and an incredibly rewarding one. From an engineering standpoint, it’s incredibly satisfying to transform lifeless software (or hardware) into a living, breathing system – especially when it brings about tangible benefits for those involved.

In this post, I’ll walk through how my team and I went about installing a third-party software application used to manage a fleet of grid-connected battery energy storage systems (BESSs) – which are effectively big batteries that can be used to store or release power on command.

My goal is to share one way of securing such software, in the hope that it may offer some useful insights. Keep in mind, though, that your own context may require a very different approach.

Background

In the electrical distribution business, the product we sell is transportation. Our role – using poles, wires, and transformers – is to form part of a larger system that moves electricity from generators to consumers. It’s not enough to simply move power across our network; we must do so safely, reliably, and affordably. Equally important is ensuring that, as electricity flows across our network, we preserve its quality. That means minimising electrical noise, maintaining stable voltage levels to avoid damaging appliances or causing fires, and balancing power flows to prevent unnecessary stress on parts of our network.

To help ensure the quality of our ‘product’ remains high, my employer is deploying a fleet of small-scale Battery Energy Storage Systems (BESS) across the network. These units store or discharge energy as needed, acting as both a reservoir and a quality enhancer. Running around the clock, each BESS helps improve power quality by balancing power flows across all three phases of our electrical distribution system and, to a limited extent, soaking up disturbances.

Like most modern technology, this fleet of BESS is supported by a central Battery Management System (BMS), developed by the BESS manufacturer. The BMS is designed for use by engineers and network operators to program, monitor, and control the BESS units – essentially functioning as a mini-SCADA system.

Because each BESS unit can directly influence both the quality of our ‘product’ and the safety of our customers, it is critical that the BMS software is operated only by individuals who are authorised, and suitably qualified. While the software is strategically important to the business, it is also a high-value target for malicious actors (a.k.a. hackers) seeking to cause disruption or harm.

Start by understanding the context

Securing any software, third-party or not, begins with understanding the context in which it will operate – namely, how the business will consume and interact with it.

The BMS software my team and I are faced with is a self-contained package that, once installed on a Windows computer, will deploy all software components onto that computer with no option to do otherwise. Once it’s up and running, the Battery Management System presents a web-based interface for end-users to interact with, allowing these users to orchestrate the fleet of battery systems enrolled within. Out-of-the-box, the system looks a little like the following…

A picture representing the typical IT architecture of a battery management system

Users of the system will include senior electrical engineers, operations staff, and a handful of BI data analysts who are interested in the power quality metrics produced by each BESS. The business wants these users to access the Battery Management System via a web browser on their corporate computers.

However, allowing access from corporate computer systems – which are connected to the public internet and used for everyday tasks – raises important security concerns. Internet-facing systems like these are common entry points for malware and other undesirable cyber activity. In cybersecurity terms, end-user computers often serve as beachheads: the initial foothold that hackers use to move deeper into an organisation’s digital ecosystem. The business’s desire for browser access from corporate computers therefore requires careful consideration.

The risks

When evaluating risk, I like to keep the FAIR model (Factor Analysis of Information Risk) in mind. One of the core principles I find particularly valuable is its emphasis on what is probable, over what is possible. In the digital world, almost anything is possible, but not everything is probable. The FAIR model helps focus attention on the threats that are most likely, and therefore, most deserving of your effort.

In this respect, the risks below are those that have a reasonable probability over the lifetime of the asset.
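
As a rough illustration of favouring the probable over the merely possible, the sketch below ranks scenarios by annualised loss exposure, which FAIR derives from loss event frequency and loss magnitude. Every scenario name and number here is a made-up placeholder rather than an estimate from our environment, and a real FAIR analysis works with calibrated ranges instead of single point values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    events_per_year: float  # loss event frequency (hypothetical)
    loss_per_event: float   # loss magnitude in dollars (hypothetical)

    @property
    def annual_exposure(self) -> float:
        # Simplified point estimate: frequency multiplied by magnitude.
        return self.events_per_year * self.loss_per_event


def rank_by_exposure(scenarios):
    """Return scenarios ordered from highest to lowest annualised exposure."""
    return sorted(scenarios, key=lambda s: s.annual_exposure, reverse=True)


scenarios = [
    Scenario("Possible but improbable", events_per_year=0.001, loss_per_event=5_000_000),
    Scenario("Probable and moderate", events_per_year=0.5, loss_per_event=200_000),
]

for s in rank_by_exposure(scenarios):
    print(f"{s.name}: {s.annual_exposure:,.0f} per year")
```

Even though the improbable scenario carries the larger single loss, the probable one ranks first once frequency is taken into account.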

Risk 1

An unauthorised or unqualified individual gains access to the Battery Management System and commands one or more BESS units to inject power into the distribution network at an inappropriate time – for example, when other distributed energy sources (e.g., solar PV systems) on the same low-voltage circuit are also exporting power. Cumulatively, the amount of power being exported could exceed the capacity of the upstream 11kV/415V distribution transformer, potentially leading to overheating and transformer failure.

While this scenario is plausible, several engineering checks would need to be overlooked for it to become a reality. A single BESS unit, due to its relatively small capacity compared to the transformer, cannot independently cause an overload. However, if the LV circuit already has a high penetration of embedded generation, and those systems are generating concurrently, the combined export could push the transformer beyond its rated maximum thermal operating limit for an extended period of time.

To address this risk, we can implement a handful of controls:

  • Transformer Sizing: Ensure the upstream distribution transformer is appropriately sized not only for the after diversity maximum demand (ADMD), but also for the maximum expected export from both BESS units and distributed generation on the circuit.
  • Protective Devices: We validate proper fusing and circuit protection on both the BESS units and the transformer. These safeguards are designed to disconnect the system automatically before thermal or electrical stress causes permanent damage.
  • Monitoring Devices: We install independent LV monitoring technology on the upstream distribution transformer. This monitoring system provides an independent warning that the overall power system (at that transformer) is not operating within expected thresholds.
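
The transformer sizing check above boils down to simple arithmetic, sketched below. All ratings and loads are hypothetical, the 90% margin is an arbitrary choice for illustration, and the quantities are added arithmetically as if they were in phase; a real study would account for power factor, phase imbalance, and thermal time constants.

```python
def net_export_kva(bess_export_kva, pv_export_kva, min_load_kva):
    """Worst-case reverse flow through the transformer, in kVA.

    Simplification: all quantities are treated as in phase, so they add
    arithmetically. Local load on the circuit offsets the export.
    """
    return sum(bess_export_kva) + sum(pv_export_kva) - min_load_kva


def within_rating(export_kva, rating_kva, margin=0.9):
    """True if the export stays at or below the rating scaled by the margin."""
    return export_kva <= rating_kva * margin


# Hypothetical LV circuit: two BESS units and three PV systems.
export = net_export_kva(
    bess_export_kva=[30, 30],
    pv_export_kva=[5, 5, 10],
    min_load_kva=20,
)
print(export, within_rating(export, rating_kva=200))
```

The same check applies in the opposite direction for import, by summing BESS charging demand and peak consumer load instead.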

Risk 2

Like risk #1, an unauthorised or unqualified individual gains access to the Battery Management System and commands one or more BESS units to consume (store) power from the distribution network at an inappropriate time – for example, when other consumer loads are high during peak demand. Cumulatively, the amount of power being drawn through the upstream 11kV/415V distribution transformer could be higher than the transformer’s nominal rating, potentially leading to overheating and transformer failure.

The controls for this risk are like risk #1 – good engineering design, fusing, and independent monitoring.

Risk 3

An unauthorised or unqualified individual gains access to the Battery Management System and commands one or more BESS units to inject a large amount of reactive power (kVAr) at an inopportune time, pushing the system voltage extraordinarily high and potentially damaging network equipment and the equipment within our customers’ homes.

The controls for this risk are like risks #1 and #2 above, with the addition that we select BESS units equipped with built-in voltage-dependent resistors (VDRs) or similar protective components. These devices automatically bleed excess voltage to ground, reducing the potential harm.

Risk 4

An unauthorised or incompetent actor gains access to the Battery Management System and pushes either custom or corrupt firmware to the fleet of BESS units. This action could render the devices inoperable – either digitally (by preventing communication or control) or physically (by damaging internal components or disabling startup processes).

This type of incident, commonly referred to as “bricking”, occurs when a device receives firmware so flawed that it can no longer boot or function. While hackers could initiate such an attack, it is often more likely to be the result of an internal error, such as a well-intentioned but improperly executed firmware update by an authorised employee.

This is where we start to limit privileges within the system, such that only a select few users can undertake firmware updates. Those users must also have sufficient training and an understanding of our organisation’s change control processes, so that they would (for example) test an update on a single, low-priority BESS unit first, then gradually scale it to larger subsets of the fleet (e.g. 5 units, then 10, then 30).
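
A staged rollout of this kind can be sketched as a function that splits the fleet into progressively larger waves. The unit names and fleet size below are hypothetical, and the sketch only plans the waves; it does not talk to the BMS or push any firmware.

```python
def rollout_waves(unit_ids, wave_sizes=(1, 5, 10, 30)):
    """Split a fleet into progressively larger update waves.

    Units left over after the listed wave sizes go into one final wave.
    """
    waves, start = [], 0
    for size in wave_sizes:
        if start >= len(unit_ids):
            break
        waves.append(unit_ids[start:start + size])
        start += size
    if start < len(unit_ids):
        waves.append(unit_ids[start:])
    return waves


# Hypothetical fleet of 60 units, ordered with the lowest-priority unit first.
fleet = [f"bess-{n:03d}" for n in range(1, 61)]
for number, wave in enumerate(rollout_waves(fleet), start=1):
    print(f"wave {number}: {len(wave)} units")
```

Each wave should only proceed once the units in the previous wave have been confirmed healthy. That gate is a change-control decision made by people and is deliberately left out of the code.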

Risk 5

An unauthorised actor (i.e. a hacker) discovers a vulnerability in the web-facing components of the Battery Management System. This flaw allows them to run arbitrary code of their choosing on the underlying computer system that runs the BMS software (i.e. the hacker has found a remote code execution, or RCE, flaw). Once inside, the hacker uses this compromised computer system as a beachhead to dig deeper into our organisation’s IT network. From there, the hacker may choose to either launch attacks against our other internal IT systems or attempt to compromise those of our partners.

This scenario reflects a classic intrusion pattern in cybersecurity, where an externally accessible system becomes the initial entry point into an organisation’s digital environment. While such an attack may be complex, the likelihood is high given the design of the software. Furthermore, due to the nature of our organisation, we are an attractive target for well-resourced hackers, including those with significant time, funding, and capability.

Let’s home in on how we mitigate this risk in the next section.

Our IT Security Controls

A picture representing the IT architecture of a well-protected battery management system

Isolate the system

Ordinarily, for software of this nature, I would aim to separate the web-facing components from the backend logic and data storage. This architectural separation allows us to place strict firewall controls between each layer, helping to slow down any attacker who compromises the web server, and giving us time to detect, respond, and contain.

However, in this case, the software does not support the separation of its internal components. As a result, our next best option is to apply logical isolation at the host level using highly restrictive firewall rules.

The BMS host machine is placed in a dedicated VLAN with no access to unrelated systems. It can only communicate with explicitly approved endpoints, such as the BESS units, and is barred from reaching the broader corporate network or the Internet.

This approach significantly limits a hacker’s ability to move laterally within our IT network, should they manage to gain remote access.
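
The logic of a default-deny rule set can be modelled in a few lines, as in the sketch below. The addresses, subnet, and port are all hypothetical placeholders (nothing here reflects our real addressing or the protocol the BESS units speak), and in practice these rules live on the firewall rather than in a script.

```python
from ipaddress import ip_address, ip_network

# Hypothetical rule set: the BMS host may reach the BESS subnet on one port.
ALLOWED_FLOWS = [
    # (source network, destination network, destination port)
    (ip_network("10.50.0.10/32"), ip_network("10.60.0.0/24"), 8443),
]


def is_allowed(src, dst, port):
    """Default-deny: permit a flow only if it matches an explicit rule."""
    src_ip, dst_ip = ip_address(src), ip_address(dst)
    return any(
        src_ip in src_net and dst_ip in dst_net and port == rule_port
        for src_net, dst_net, rule_port in ALLOWED_FLOWS
    )


print(is_allowed("10.50.0.10", "10.60.0.25", 8443))  # BMS host to a BESS unit
print(is_allowed("10.50.0.10", "10.1.2.3", 8443))    # BMS host to the corporate LAN
```

Anything not listed is refused, so a compromised BMS host has no route to the corporate network or the Internet. Inbound access from the authentication portal would need its own explicit rule.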

Place the web interface behind a captive portal, and limit access

Because the Battery Management System exposes a web interface to end-users, we have a couple of options around how we can control access to this interface. Exposing it directly to the public internet is not an option – the risk is far too high for a system of this sensitivity. Instead, we could restrict access via our corporate VPN, which keeps the BMS interface entirely off the public internet, or we could place it behind a captive portal, which can allow controlled access without requiring full VPN access.

In this case, we have elected to place the BMS web interface behind Microsoft’s authentication portal. This approach is particularly suitable because it allows occasional access by non-organisational users (such as the BESS vendor) without the need to grant them VPN credentials, which would introduce additional risks to the wider corporate network.

By leveraging Microsoft’s authentication platform, we gain several key adjacent benefits:

  • Strong Multi-Factor Authentication (MFA) is enforced before access is granted.
  • Device posture checks are performed to ensure that the connecting computer is up to date and running an approved anti-malware package.

Limit privilege for the access that is provided

For all user accounts with access to the Battery Management System (BMS), we apply the principle of least privilege, ensuring each user is granted only the minimum level of access required to perform their role.

This isn’t because we don’t trust our people. No, it’s a proactive safeguard. Limiting privileges helps reduce the risk of users accidentally performing actions beyond their expertise, and it minimises the potential damage if their user account is compromised and used for unauthorised activity.
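
In code terms, least privilege amounts to an explicit mapping from roles to permitted actions, with everything else refused. The role and action names below are hypothetical and loosely based on the user groups described earlier; the actual permission model inside the BMS may look quite different.

```python
# Hypothetical roles and actions; the real BMS permission model may differ.
ROLE_PERMISSIONS = {
    "analyst": {"view_metrics"},
    "operator": {"view_metrics", "dispatch"},
    "firmware_admin": {"view_metrics", "push_firmware"},
}


def can(role, action):
    """Deny by default: unknown roles and unlisted actions are refused."""
    return action in ROLE_PERMISSIONS.get(role, set())


print(can("analyst", "view_metrics"))   # analysts may read power quality data
print(can("analyst", "push_firmware"))  # but may not touch firmware
```

Keeping the mapping small and explicit also makes it easy to review who can do what during an access audit.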

Separate authentication domains

Authentication for the Battery Management System is deliberately separated from our corporate identity management system. While users may initially authenticate via Microsoft’s captive portal using their corporate credentials, these same credentials do not provide direct access to the underlying BMS interface.

This separation is intentional. Corporate credentials – used to access email, Microsoft Office applications, and other internal systems – are high-value targets for phishing and are more likely to be compromised. By maintaining a distinct authentication domain for the BMS, we reduce the risk of a compromised corporate account being used to gain unauthorised access to critical infrastructure.

While this approach introduces a slight trade-off between security and user convenience, it aligns with our organisation’s broader identity and access management strategy. As such, the additional cognitive load for users is minimal and well justified.

Install allow-listing and EDR tooling

As a baseline security control, all computer systems – including those running the BMS software and end-user devices – are equipped with Endpoint Detection and Response (EDR) tooling. These advanced solutions go beyond traditional antivirus by detecting and responding to exploits, malware, and suspicious behaviour in real time.

In addition, we deploy application allow-listing software on many of our IT systems. This ensures that only pre-approved, trusted applications can execute, preventing unknown or unauthorised code from running.

For high-sensitivity systems like the BMS computer, these controls are configured with the highest possible security settings, significantly reducing the likelihood that malicious software can execute without being blocked or triggering an alert for investigation.

Ensure monitoring, logging, and anomaly detection are in place

Complementing our core IT, computing, and networking infrastructure is a suite of monitoring tools that continuously collect and analyse system logs. These tools – sometimes referred to as Intrusion Detection and Prevention Systems – establish a baseline of “normal” behaviour across our digital environment and actively monitor for deviations.

By ingesting real-time data from a variety of sources, these systems are capable of detecting anomalous activity across our IT networks, computer systems, and user accounts. With the integration of modern machine learning algorithms, they provide around-the-clock monitoring, alerting, and response, helping to limit the impact of any successful intrusion.
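
To make the idea of a baseline concrete, the sketch below flags a value that sits far outside what has been seen before, using a simple z-score. This is a deliberately naive stand-in and not how our commercial tooling works; the metric (daily dispatch commands) and the numbers are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, observed, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return observed != history[0]
    return abs(observed - mean(history)) / spread > threshold


# Hypothetical daily counts of dispatch commands issued through the BMS.
baseline = [4, 5, 6, 5, 4, 6, 5]
print(is_anomalous(baseline, 5), is_anomalous(baseline, 40))
```

A sudden jump from roughly five commands a day to forty would be flagged for investigation, while a normal day would not.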

Ensure physical protections are in place

Because our Battery Energy Storage Systems (BESS) include field-deployed components – installed in publicly accessible locations – we take deliberate steps to physically secure this equipment and detect signs of tampering.

Physical protections are designed to prevent interference with the BESS units and the IT equipment housed within. We also implement detection and response mechanisms that trigger alerts when specific physical events occur.

The physical security of our digitally connected assets is an increasingly important aspect of our overall cybersecurity posture, especially as more operational technology becomes network-connected.

SOPs, training, and policy

Last, but not least, all of the above means nothing if we have not taken the time to establish robust standard operating procedures, training modules, policies, and competency requirements. Not just for the use of the battery management system itself, but to monitor, operate, and maintain the raft of cybersecurity controls that bring each risk to within tolerable levels.

Without this human-centric layer, which is arguably the most important layer, we risk exposing ourselves and our customers to incidents that no one wants to experience.

Good risk management starts with you

If there’s a single takeaway, it’s this: risk management is a core part of an IT/OT professional’s role. It cannot be done in isolation, nor is it confined to the cyber realm. As systems become increasingly interconnected and business users demand greater flexibility, we need to take a holistic view of the risk landscape. Think through plausible scenarios, then apply the full range of mitigations in your toolbox to keep risk within acceptable limits. Our customers, our colleagues, and in some cases, our society, depend on it.

This post is licensed under CC BY 4.0 by the author.