Explain Like I'm Five: Cross Site Scripting (XSS)
Background
Cross-site scripting (XSS) is a security vulnerability that affects many websites and is responsible for a fair amount of data loss and user account compromises. In this post, I will explain exactly what cross-site scripting is, why it is a security issue, and how websites can protect themselves against it.
Explain Like I'm Five
Imagine that a hypnotized monkey has special commands. When you speak to it, you can say "Name" and it will repeat its name. You can say "Color" and it will repeat its favorite color. There are commands that the monkey trainer uses to teach the monkey new things, but they are not made available to you. One day, while talking to the monkey, you say "Name Program Name Bob." You don't know it yet, but the special word was "Program." Whenever the monkey hears that word, it will update its knowledge based on what you say next. You have now taught the monkey that its name is "Bob" instead of whatever it was before.
Websites are similar to the monkey. There are many built-in programs that can change parts of the site not usually available to the average user. If the website does not treat everything the user sends to it with suspicion, the user may be able to add his or her own special commands (code) to the page.
In website design, there are two kinds of pages (simplified): static and dynamic. Static pages do not change. They are usually blog posts or simple sites, and their addresses often end in ".htm" or ".html." Dynamic pages can change based on different conditions or user input. They often belong to much more complex websites and applications. Their URLs may end in ".php" or ".py" and may contain special characters like "&" and "=."
Dynamic pages are the ones we need to worry about with Cross Site Scripting (often called XSS so as not to confuse it with CSS, which is a language used to style web pages). Dynamic pages have many features that may ask a user for input.
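To make the "&" and "=" characters concrete: they usually separate the pieces of user input that a dynamic page receives through its URL. Here is a minimal sketch in Python, assuming a made-up address (example.com/page.php) and made-up parameters (name and color) that are only for illustration:

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL; the host, page, and parameter names are invented.
url = "http://example.com/page.php?name=Bob&color=blue"

# Everything after "?" is the query string: key=value pairs joined by "&".
query = urlsplit(url).query
params = parse_qs(query)

print(query)   # name=Bob&color=blue
print(params)  # {'name': ['Bob'], 'color': ['blue']}
```

Each value in that dictionary came from whoever built the URL, which is why a dynamic page has to treat it as untrusted input.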
Example
Let's look at a simple example.
On a page called "myname.php," a user can enter his or her name and the page will say "Hello" and then display the user's name. This works well if the user enters plain text like "Bob": the page will simply say "Hello Bob."
But suppose the user is a hacker. Since the web page just displays whatever the user types, maybe he can add special code that will make bad things happen. Instead of "Bob," he enters:
Bob<script>alert('attack');</script>
The extra text at the end is JavaScript, a language used on the web to create dynamic pages that offer lots of functionality. JavaScript is normally written into the web pages by their developers. But in this case, the attacker is adding his own JavaScript to the page. When the browser receives the page, it sees "Bob" as text, but then sees the script portion of the input and thinks "Hey! This is code, I should execute it."
The result is that a box will pop up saying "attack." You're probably thinking "that's not that bad," but if that code works, almost any code will work. Suppose the attacker inserts code that creates a fake login box on the page asking the user for his or her password. The password could then be stolen. And if the name is passed along in the page's URL, the attacker can send a crafted link to other people so that the code runs in their browsers, not just his own.
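To see why the page is fooled, here is a rough sketch in Python of what a page like "myname.php" might be doing behind the scenes. The function name greet_unsafe is hypothetical, and this is an illustration of the mistake rather than the actual code of any real site:

```python
def greet_unsafe(name: str) -> str:
    # Hypothetical stand-in for myname.php: the input is pasted
    # straight into the HTML with no checking or escaping.
    return "<p>Hello " + name + "</p>"

# A normal visitor: the page contains only text.
print(greet_unsafe("Bob"))

# The attacker: the page now contains a real <script> element,
# which a browser would run.
print(greet_unsafe("Bob<script>alert('attack');</script>"))
```

The function cannot tell the two inputs apart. It builds the page the same way both times, and the browser is the one that later decides the second page contains code.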
How To Prevent XSS
This attack happens because the web page treated the attacker's input as actual code. There are many ways to treat the input strictly as text. This process is often called "sanitization of user input" and is a very important security concept.
Sanitization can be done through "escaping" special characters. Characters such as "<" and ">" are often signals that code is being entered. (Who has a ">" in their name?) In a language called PHP, which is one of the dynamic web programming languages, there is a special function called "htmlspecialchars." This function converts those special characters into harmless stand-ins (for example, "<" becomes "&lt;" and ">" becomes "&gt;"), which the browser displays as the original characters instead of treating them as the start of code. So with that function in use, the user would now see "Hello Bob<script>alert('attack');</script>" as plain text on the page.
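The same idea can be sketched in Python, whose standard library function html.escape plays a role similar to PHP's "htmlspecialchars." The function name greet_safe is hypothetical, and this is a minimal illustration rather than a complete defense:

```python
import html

def greet_safe(name: str) -> str:
    # Escape the input first, so "<" becomes "&lt;" and ">" becomes "&gt;".
    return "<p>Hello " + html.escape(name) + "</p>"

page = greet_safe("Bob<script>alert('attack');</script>")
print(page)
# <p>Hello Bob&lt;script&gt;alert(&#x27;attack&#x27;);&lt;/script&gt;</p>
```

The browser shows the escaped text exactly as the attacker typed it, but never sees an actual script tag, so nothing is executed.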
Conclusion
XSS is a very common vulnerability because it is very easy for web developers to forget to escape the user's data, especially when there are many places it can be entered.
Hopefully you now have a basic understanding of what XSS is. There are many more advanced ways it can be used to compromise a website, but this is a starting point. If you ever design a site, always remember this one rule: "Never trust user data!"