urlscan.io API v1


Our API allows you to submit URLs for scanning and retrieve the results once the scan has finished. Furthermore, you can use the API to search existing scans by attributes such as domains, IPs, Autonomous System (AS) numbers, hashes, etc. Searching and retrieving results through the API can be done anonymously, i.e. without an API key. To submit URLs for scanning, you will need to create a user account, create an API key, and supply the key when submitting a scan.

Submissions via the API can have a visibility of public or private. Public scans will show up on our front page and in the search API, private scans can only be retrieved with the correct scan ID. We prefer public submissions so other users can benefit from the results of your scans.

To get started, you can check out one of the existing tools and integrations for urlscan.io.


Submission API

Once you get your API key, you can submit pages like this:

curl -X POST "https://urlscan.io/api/v1/scan/" \
      -H "Content-Type: application/json" \
      -H "API-Key: $apikey" \
      -d "{\"url\": \"$url\", \"public\": \"on\"}"
If you don't want your scan to be public, simply omit the public attribute from the request body. The downside is that private scans cannot be found via the Search page or the Search API!
The response to the API call contains the scan ID and the API endpoint for the scan result. After waiting a short while, you can use that endpoint to retrieve the result. Until the scan is finished, the endpoint will respond with an HTTP 404 status code.
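Since the result endpoint returns 404 until the scan finishes, a client has to poll. A minimal sketch of that loop, with the HTTP call abstracted into a fetch callable so the retry logic is visible (the function name, attempt cap, and delay are illustrative choices, not documented API behavior):

```python
import time

def poll_result(fetch, attempts=30, delay=2):
    """Call fetch() until the scan result is ready.

    fetch() should return (status_code, body) for
    GET https://urlscan.io/api/v1/result/<uuid>/.
    Returns the body once the status is no longer 404, else None.
    """
    for i in range(attempts):
        status, body = fetch()
        if status != 404:
            return body
        if i < attempts - 1:
            time.sleep(delay)
    return None
```

In practice fetch would wrap a real HTTP request against the result endpoint; injecting it keeps the polling logic independent of any particular HTTP library.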

Other options that can be set in the POST data JSON object:

  • customagent: Override User-Agent for this scan
  • referer: Override HTTP referer for this scan
  • public: Omit this attribute to submit as private scan
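The request body with these options can be assembled programmatically. A small Python sketch (the helper name build_scan_body is illustrative, not part of the API):

```python
import json

def build_scan_body(url, public=True, customagent=None, referer=None):
    """Build the JSON body for POST https://urlscan.io/api/v1/scan/.

    Omitting "public" submits the scan as private; customagent and
    referer override the User-Agent and HTTP referer for this scan.
    """
    body = {"url": url}
    if public:
        body["public"] = "on"
    if customagent:
        body["customagent"] = customagent
    if referer:
        body["referer"] = referer
    return json.dumps(body)
```

Send the returned string as the POST data, together with the Content-Type: application/json and API-Key headers shown above.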

If you have a file containing one URL per line, you can use the following snippet to submit all of them:

cat list | tr -d "\r" | while read url; do
  curl -X POST "https://urlscan.io/api/v1/scan/" \
      -H "Content-Type: application/json" \
      -H "API-Key: $apikey" \
      -d "{\"url\": \"$url\", \"public\": \"on\"}"
  sleep 2
done


Result API

You can get the result of a scan using the scan ID:

curl https://urlscan.io/api/v1/result/$uuid/
Until a scan is finished, this URL will respond with an HTTP 404 status code.

Once the scan is in our database, the URL will return a JSON object with these top-level properties:

  • task: Information about the submission: Time, method, options, links to screenshot/DOM
  • page: High-level information about the page: Geolocation, IP, PTR
  • lists: Lists of domains, IPs, URLs, ASNs, servers, hashes
  • data: All of the requests/responses, links, cookies, messages
  • meta: Processor output: ASN, GeoIP, AdBlock, Google Safe Browsing
  • stats: Computed stats (by type, protocol, IP, etc.)
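Once the JSON is retrieved, these top-level properties can be read directly. A sketch of pulling out a few convenient fields (the sample result in the usage below is a made-up minimal fixture, not real API output):

```python
def summarize(result):
    """Pull a few convenient fields out of a scan result JSON object."""
    return {
        "url": result["task"]["url"],        # the submitted URL
        "ip": result["page"]["ip"],          # IP the page resolved to
        "domains": result["lists"]["domains"],  # all domains contacted
    }
```

For example, summarize({"task": {"url": "https://example.com"}, "page": {"ip": "192.0.2.1"}, "lists": {"domains": ["example.com"]}}) returns those three fields as a flat dictionary.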

Some information is duplicated across these properties for convenience.

In a similar fashion, you can get the DOM and screenshot for a scan using these URLs:

curl https://urlscan.io/screenshots/$uuid.png
curl https://urlscan.io/dom/$uuid/


Search API

You can search for scans using the same ElasticSearch syntax as on the Search page. Each result contains a little bit of high-level information and a link to the API for the full scan result.

curl "https://urlscan.io/api/v1/search/?q=domain:urlscan.io"
Available query parameters for the search endpoint:

  • q: The query term (ElasticSearch simple query string). Default: "*"
  • size: Number of results returned. Default: 100
  • offset: Offset of first result (for pagination). Default: 0
  • sort: Sorting, specified via $sort_field:$sort_order. Default: _score
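These parameters compose into a search URL, and pagination simply advances the offset by the page size. A sketch using only the parameters documented above (the helper names are illustrative):

```python
from urllib.parse import urlencode

def search_url(query, size=100, offset=0, sort="_score"):
    """Build a Search API URL for the given ElasticSearch query string."""
    params = urlencode([("q", query), ("size", size),
                        ("offset", offset), ("sort", sort)])
    return "https://urlscan.io/api/v1/search/?" + params

def next_page(size, offset):
    """Advance the offset to the next page of results."""
    return offset + size
```

Using urlencode takes care of escaping characters like the colon in domain:urlscan.io.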

API search can only find public scans; there is no way to search for private scans.


Rate Limiting

We have basic rate limiting in place to catch runaway scripts and keep the load on our system in check. When bulk-submitting URLs, make sure you wait 2 seconds between consecutive requests. If you exceed the limit, the server will respond with an HTTP 429 status code.
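To stay under the limit when submitting in a loop, it is enough to track the time of the previous request and sleep for whatever remains of the 2-second window. A sketch (the Throttle class is illustrative; the clock and sleep functions are injectable so the logic is easy to verify):

```python
import time

class Throttle:
    """Ensure at least min_interval seconds between consecutive submissions."""

    def __init__(self, min_interval=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous request, or None

    def wait(self):
        """Block until min_interval has passed since the last call."""
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now += remaining
        self.last = now
```

Call throttle.wait() immediately before each scan submission.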

There is no rate limiting for the Search API and the Result API.


Error Codes

A scan request may be rejected with an error code for various reasons, including, among others:

  • Spammy submissions of URLs known to be used only for spamming this service.
  • Invalid hostnames or invalid protocol schemes (FTP, etc.).
  • Missing URL property ... yes, it does happen.
  • Contains HTTP basic auth information ... yes, that happens as well.
  • Non-resolvable hostnames (A, AAAA, CNAME) which we will not even try to scan.

An error will typically be indicated by an HTTP 400 status code. It might look like this:

{
  "message": "DNS Error - Could not resolve domain",
  "description": "The domain .google.com could not be resolved to a valid IPv4/IPv6 address. We won't try to load it in the browser.",
  "status": 400
}
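Client code should treat a 400 response as a structured error and surface the message field. A sketch (ScanError is an illustrative exception, not part of any client library):

```python
import json

class ScanError(Exception):
    """Raised when the submission API rejects a scan request."""

def check_response(status, body):
    """Return the parsed response body, or raise ScanError on HTTP 400."""
    data = json.loads(body)
    if status == 400:
        raise ScanError(data.get("message", "unknown error"))
    return data
```

The description field carries the longer human-readable explanation and is worth logging alongside the message.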

If you think an error is incorrect, let us know via mail!


Integrations & Tools

A few companies and individuals have integrated urlscan.io into their tools and workflows.
If you'd like to see your tool listed here, send us an email!

Commercial
  • Demisto Enterprise - Incident Lifecycle Platform
  • Phantom - Security Automation & Orchestration Platform
  • Siemplify - Security Orchestration, Automation and Incident Response
  • Swimlane - Security Orchestration, Automation and Response
  • IBM Resilient - IBM Resilient Incident Response Platform
Open Source

Manual & Automatic Submissions

Manual submissions are submissions made through our website; no registration is required. They offer the same features as API and automatic submissions.

Automatic submissions are URLs we collect from a variety of sources and submit to urlscan.io internally. The reason behind this is to provide good coverage of well-known URLs, with a particular focus on potentially malicious sites. This helps when scanning a new site and searching for one of the many features (domain, IP, ASN) that can be extracted.

Automatic Sources

  • OpenPhish: AI-powered list of potential phishing sites
  • PhishTank: List of potential phishing sites
  • CertStream: Suspicious domains observed through CertStream
  • Twitter: URLs tweeted or pasted by various Twitter users