XML encoder utility
XMLEncoder is a utility class for XML entity encoding. This utility can be used to escape parts of the response that may have untrusted active content.
Description
A website might inadvertently include malicious HTML tags or scripts in a dynamically generated page when the page is built from unvalidated input that comes from untrustworthy sources. By following a malicious URL to an application server, a user may unknowingly cause script code to run in the browser. The browser executes the script without the user's knowledge, and the script runs with the same access as the page that delivered it, for example to cookies and session data for that site. Such code is typically embedded between <script> and </script> tags.
The problem can be prevented by encoding server-generated pages so that embedded scripts are not executed. Developers who generate responses that contain client data taken from HTTP requests can encode the response data with the following static method: zero.util.XMLEncoder.escapeXML(String s)
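To make the idea of entity encoding concrete, the following is a minimal Java sketch of what an XML escaping routine generally does. It is not the source of zero.util.XMLEncoder, and the class name XmlEscapeSketch and method name escapeXml are hypothetical. It assumes that the five characters with special meaning in markup are replaced; the exact set of characters and entities that escapeXML produces may differ.

```java
// Hypothetical stand-in for illustration; not the zero.util.XMLEncoder source.
final class XmlEscapeSketch {

    // Replaces the five characters that have special meaning in XML/HTML markup.
    static String escapeXml(String s) {
        if (s == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                // Numeric reference chosen here because it is valid in both XML and HTML.
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: &lt;script&gt;alert(&#39;message&#39;)&lt;/script&gt;
        System.out.println(escapeXml("<script>alert('message')</script>"));
    }
}
```

Because every < and > is replaced, no substring of the output can open or close a tag, which is the property the encoding relies on.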
PHP developers may use htmlentities() to convert characters to HTML entities.
Example
The typical use case is a malicious request parameter that the client sends to the server in a URL such as http://localhost:8080/test.groovy?param1=<script>alert('message')</script>. In this example, the request parameter param1 contains a JavaScript alert function call which, if executed, would display the word "message" in a dialog window in the client browser. To prevent this, a developer can use the IBM® WebSphere® sMash method XMLEncoder.escapeXML(String s).
The following Groovy snippet reads the incoming request parameter param1, escapes the malicious content, and prints it.
def paramValue = request.params.param1[]
def paramValueEscaped = zero.util.XMLEncoder.escapeXML(paramValue);
print paramValueEscaped;
View the page source in the browser to see the result of escapeXML: the angle brackets of the script tags appear as the entities &lt; and &gt;. The browser window itself displays the parameter value as plain text: <script>alert('message')</script>
Because the script tag is escaped, the browser treats it as text rather than as markup. The result is that the content is displayed on the page instead of being executed as a script on the client.
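The difference between echoing the parameter as received and echoing it escaped can be sketched in plain Java. This is an illustration under stated assumptions, not sMash code: the class EchoPageSketch and its methods are hypothetical, the page template is invented for the example, and the small escape helper stands in for XMLEncoder.escapeXML, whose exact output may differ.

```java
// Illustrative sketch; class and method names are hypothetical, not sMash APIs.
final class EchoPageSketch {

    // Minimal escaper: '&' must be replaced first so that the entities
    // produced by the later replacements are not escaped a second time.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    // Builds a response body, with or without escaping the parameter value.
    static String render(String paramValue, boolean escaped) {
        String body = escaped ? escape(paramValue) : paramValue;
        return "<html><body>" + body + "</body></html>";
    }

    public static void main(String[] args) {
        String param1 = "<script>alert('message')</script>";
        System.out.println(render(param1, false)); // a script element reaches the browser
        System.out.println(render(param1, true));  // only inert text reaches the browser
    }
}
```

In the unescaped page the browser finds a real script element and runs it. In the escaped page the same characters arrive as entities, so the browser shows them as text.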