XML External Entity (XXE) Attack

An XML external entity (XXE) attack is simple to execute and usually stems from a carelessly configured XML parser, yet its results can be catastrophic.

This attack is mainly used to read sensitive files on the server or data that belongs to other users. For example, on a Unix-based server it can expose the file /etc/shadow, which stores users' password hashes.

What makes an XXE attack possible is a special construct in the XML specification for importing external files. This directive, called an external entity, is interpreted by the machine that parses the XML file. As a result, a crafted XML payload sent to a server's XML parser can expose files from that server's file system.
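
To make the mechanism concrete, here is a minimal sketch of the substitution step that an entity-resolving parser performs. It is a toy, not a real XML parser: the function expandExternalEntities, the in-memory fakeFiles map standing in for the server's file system, and the /etc/hostname entry are all assumptions made for illustration.

```javascript
// Hypothetical stand-in for the server's file system (illustration only).
const fakeFiles = new Map([['file:///etc/hostname', 'web-01']]);

// Toy expansion: NOT a real XML parser, only the entity substitution step.
function expandExternalEntities(xml, loadUri) {
  const entities = new Map();

  // Collect declarations of the form <!ENTITY name SYSTEM "uri">
  const declaration = /<!ENTITY\s+([\w.-]+)\s+SYSTEM\s+"([^"]*)"\s*>/g;
  for (const [, name, uri] of xml.matchAll(declaration)) {
    entities.set(name, loadUri(uri));
  }

  // Drop the DOCTYPE block, then replace every known &name; reference.
  const body = xml.replace(/<!DOCTYPE[^\[>]*\[[\s\S]*?\]\s*>/, '');
  return body.replace(/&([\w.-]+);/g, (match, name) =>
    entities.has(name) ? entities.get(name) : match
  );
}

const payload =
  '<!DOCTYPE note [<!ENTITY host SYSTEM "file:///etc/hostname">]>' +
  '<note>&host;</note>';

const expanded = expandExternalEntities(
  payload,
  (uri) => fakeFiles.get(uri) ?? ''
);
console.log(expanded); // <note>web-01</note>
```

The point of the sketch is that the document's author, not the server, chooses which URI is loaded. A real parser with external entities enabled does the same thing against the real file system.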

XXE can occur in any server request that accepts an XML or XML-like payload. Since modern web development mostly exchanges data as JSON, you may wonder where XML-like payloads are still used.

Formats like SVG, HTML/DOM, PDF (XFDF), and RTF are similar to XML. Because they share much of their structure with the XML spec, many XML parsers also accept them as input.

XXE attacks can be classified into two types.

Direct XML external entity (XXE)

In a direct XXE attack, the attacker targets an endpoint that accepts XML-like input and sends it a payload that declares an external entity.

If the server's XML parser is not configured to block external entities, it resolves the entity while parsing the payload. The returned result then contains the entity's content, which may be sensitive information that compromises the whole server.

Let us walk through a scenario in which XXE can occur on our website, https://my-web-hosting.com

Assume that our web hosting website lets the user take a screenshot of the page and send it to customer support.

For this, we have added a button to the page.

<button id="capture-the-screen">Capture the screen</button>

And then we have this JavaScript function that captures the screenshot.

const screenshot = async () => {
  try {
    // Get the main element and serialize it to an XML string.
    // serializeToString expects a DOM node, not the innerHTML string.
    const page = document.getElementById('main');
    const serializer = new XMLSerializer();
    const pageAsString = serializer.serializeToString(page);

    // Request to convert the serialized DOM to an image
    const response = await fetch('https://my-web-hosting.com/screenshot', {
      method: "POST",
      body: pageAsString
    });

    // The endpoint responds with JPEG data, so read it as binary
    const image = await response.blob();
  } catch (e) {
    console.error(e);
  }
}

document.getElementById("capture-the-screen").addEventListener('click', screenshot);

Let us walk through the screenshot function. It takes a DOM element, the element with the id main, and serializes it to a string. We use XMLSerializer for this because HTML follows an XML-like structure.

When the request is made, the server handling it assumes that the payload will always be an HTML structure that has to be converted to an image.

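import express from 'express';
// xmltojpg is this example's own converter module
import xmltojpg from './xmltojpg';

const app = express();
// read plain-text request bodies into req.body as a string
app.use(express.text());

// convert the XML to JPEG image and return it in the response
app.post('/screenshot', function(req, res) {
 // if the body is empty return
 if (!req.body) { return res.sendStatus(400); }

 // convert the XML to an image
 xmltojpg.convert(req.body)
 .then((jpg) => res.send(jpg))
 // a promise reports failure through catch, not through an err argument
 .catch(() => res.sendStatus(400));
});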

Because of this assumption, the parser used inside xmltojpg has not been configured to block external entities.

An attacker discovers this vulnerability and sends a custom payload.

const pageAsString = `<!DOCTYPE xxe [<!ENTITY xxe SYSTEM "file:///etc/passwd" >]><xxe>&xxe;</xxe>`;

The parser parses this string and converts it to an image. Because the parser is not configured to block external entities, the image will contain the contents of the file named in the payload.

If the attacker gets hold of such sensitive data, the outcome can be devastating, as the whole server can be compromised.
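
One way to narrow this hole is to refuse any payload that declares a DTD or an entity before it ever reaches the converter. The sketch below is a coarse pre-filter under stated assumptions: the helper names declaresDoctypeOrEntity and checkScreenshotPayload are hypothetical, and the check complements, rather than replaces, disabling external entities in the XML parser itself.

```javascript
// Coarse pre-filter (an assumption of this sketch, not part of the server
// shown earlier): flag any payload that declares a DTD or an entity.
function declaresDoctypeOrEntity(xml) {
  return /<!\s*(DOCTYPE|ENTITY)\b/i.test(xml);
}

// Decide whether a request body may be handed to the converter.
function checkScreenshotPayload(body) {
  if (typeof body !== 'string' || body.length === 0) {
    return { ok: false, status: 400, reason: 'empty or non-text body' };
  }
  if (declaresDoctypeOrEntity(body)) {
    return { ok: false, status: 400, reason: 'DTD or entity declaration not allowed' };
  }
  return { ok: true };
}

const attack =
  '<!DOCTYPE xxe [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><xxe>&xxe;</xxe>';
const benign = '<div id="main"><p>Hello</p></div>';

console.log(checkScreenshotPayload(attack).ok); // false
console.log(checkScreenshotPayload(benign).ok); // true
```

In the /screenshot handler, such a check would run right after the empty-body test. A legitimately serialized page has no reason to carry a DOCTYPE with entity declarations, so rejecting them should not affect normal users.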

Indirect XML external entity (XXE)

In an indirect XXE attack, the attacker targets request endpoints that do not accept XML or XML-like input themselves but internally call another service that parses XML, which results in sensitive information being exposed.

Assume your service internally uses a third-party payment service, hosted on your premises, to handle payments on your website.

This payment service accepts XML/SOAP in the request payload, as most legacy services do, so you convert the data to XML before passing it to the service.

The payment service has not been configured to block external entities, and the attacker supplies input, such as the pageAsString shown earlier, that ends up in the XML parsed on the server. Because the service is hosted on your premises, it parses that XML and returns the contents of the referenced files.
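
Part of the defense sits in the code that converts user data to XML. The sketch below escapes user-supplied values so they are treated as text and cannot introduce markup of their own. The escapeXml and buildPaymentXml helpers and the payment element names are hypothetical, since a real payment service defines its own schema, and the internal service's parser should still have external entities disabled.

```javascript
// Escape the five characters that have special meaning in XML.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical request shape (illustration only).
function buildPaymentXml({ orderId, note }) {
  return (
    '<payment>' +
    `<orderId>${escapeXml(orderId)}</orderId>` +
    `<note>${escapeXml(note)}</note>` +
    '</payment>'
  );
}

const xml = buildPaymentXml({
  orderId: 'A-1001',
  note: '<!ENTITY xxe SYSTEM "file:///etc/passwd">&xxe;',
});

// The attacker's markup arrives as inert text inside <note>.
console.log(xml);
```

The ampersand is escaped first so that the entities produced by the later replacements are not escaped a second time.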

This can be a major threat: your own system may be robust, but a third-party application it relies on can still have vulnerabilities.

After all, attackers exploit the loopholes that go unnoticed by the engineers in an organization.

It is therefore extremely important to run a vulnerability scan on every third-party application before hosting it on your premises.

Summary

An XML External Entity attack has to be handled on both the client and the server. This simple-looking attack can take down the whole server.