An in-depth guide to using HTTPX for advanced HTTP probing and analysis.
According to ProjectDiscovery,
httpx is a fast and multi-purpose HTTP toolkit built to support running multiple probes using the retryablehttp library. Probes are specific tests or checks to gather information about web servers, URLs, or other HTTP elements. Httpx is designed to maintain result reliability with an increased number of threads.
If you want to learn more about httpx from the official documentation, check out the httpx Overview
Httpx stands out for a few very important reasons:
Let's break it down one by one.
Since httpx performs active scanning, bug bounty programs or security assessments sometimes require you to set specific headers, authorization credentials, etc. These rules are very important to follow because they are one of the ways the company knows how to treat your traffic.
But that is not the only reason this is important. Some websites serve different content based on the User-Agent or other headers. During reconnaissance, it is very important to understand everything the back-end does in order to widen the attack surface.
The wider the surface, the better your chances of finding a bug!
So play around with different methods, maybe probe all IPs and see if something is different in the response. Maybe one of the servers is deployed with a new patch that is vulnerable, while others are running old, well-tested code.
Another interesting feature I love to use is specifying a proxy. I can route the traffic through my intercepting proxy so that all of the targets are populated in the proxy history. Then I can look at the requests and probe them directly.
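As a sketch, this is how I route everything through a local intercepting proxy (the port and the hosts.txt file are assumptions — adjust to your setup):

```shell
# Route all httpx traffic through an intercepting proxy such as Burp Suite
# listening on 127.0.0.1:8080, so every target lands in the proxy history.
# hosts.txt is an assumed file with one host per line.
httpx -l hosts.txt -http-proxy http://127.0.0.1:8080
```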
From the time you make a request until the server receives it, there are many hops the request might pass through. Every hop is a potential place where the request might fail. So ideally, if a request fails with a retryable error, you want to try it again.
But it is not that simple. If you always retry right away, you create a lot of noise, which can be perceived by WAFs/rate-limiters as malicious traffic. Retrying with backoff is a standard practice in web development, which means it should be a standard practice for security testing too. And you get this out of the box!
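A minimal sketch of enabling retries (the hosts.txt file and the retry count are illustrative):

```shell
# Retry transient failures a couple of times instead of giving up;
# keep the count low to avoid generating unnecessary noise.
httpx -l hosts.txt -retries 2
```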
As with every tool from ProjectDiscovery, you can easily use the output of httpx as the input to another tool in your pipeline. This is incredibly important for automation.
You can specify the level of details each probe shows, filter out unwanted results or only include the desired results, or extract the data you need.
Probably the best and the easiest way to install httpx is using Go modules.
For other installation methods, check out the official installation guide.
To install httpx, run:
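```shell
go install -v github.com/projectdiscovery/httpx/cmd/httpx@latest
```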
I would advise you to have go installed on your machine, and have your PATH already set up to include the Go binaries.
You can always update the tool by using the same command, or simply running:
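```shell
httpx -up
```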
Now, let's discuss some of the most important features of httpx.
If you run httpx -h, you will see a lot of options. Let's go through some of the most important ones.
Let's break it down one by one, talk about what they do, and how to use them.
As of writing this blog post, httpx supports the following probes:
-l, -list
I mostly prefer the -l option, where I can specify a file that contains all of the hosts I want to probe.
Usually, inside the workflow, I would use it like this:
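A sketch of a typical invocation (the input file, probe flags, and output path are my usual choices, not a prescription):

```shell
# hosts.txt comes from an earlier subdomain-enumeration step.
# Pick the probes you care about; -silent keeps the output pipeline-friendly.
httpx -l hosts.txt -sc -cl -title -td -silent -o httpx-results.txt
```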
There are a few reasons why I prefer this approach:
-rr, -request
This option is very useful when you want to test endpoints that require specific headers or cookies for authentication.
Now, you can leverage this option by using BountyHub's storage. You can save the raw request file in the storage, and download it as part of the workflow.
Then, you can use it with the -rr option. It might be an overkill for some cases, but for others, it is a life-saver.
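A sketch of what that looks like, assuming request.txt is a raw HTTP request exported from your proxy (including any auth headers or cookies):

```shell
# Replay the saved raw request against the target list.
httpx -l hosts.txt -rr request.txt
```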
It is likely best if you use nuclei for this testing, but it is still a useful option to have.
-u, -target
This option is useful when you want to quickly test a few hosts without creating a file. I would still advise you to use the -l option for better re-usability and cleaner workflows. But you can save a variable in the project that contains comma-separated values, and use it like this:
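A sketch, where TARGETS is a hypothetical project variable holding comma-separated hosts:

```shell
# TARGETS would normally come from your project configuration.
TARGETS="example.com,api.example.com"
httpx -u "$TARGETS" -sc -title
```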
Probes are the core of httpx. They provide the information you need about the target hosts, that you can use for further testing.
-sc, -status-code
This option is very useful when you want to filter the results based on the status code. For example, it would be very useful to only show results that return 200 OK or 403 Forbidden.
-cl, -content-length
I prefer to include this option in order to see the size of the response. It is very useful when you want to notice that the response size has changed. If a new version of the website is deployed, the content length might change, and I'd like to be notified when that happens.
-ct, -content-type
This option shows the content type of the response. It is very useful when you want to identify the type of the response.
Let's say some endpoint returns application/json instead of text/html. It might be interesting to test it further as an API instead of a web page.
-location
This option shows the redirect location of the response. It is very useful when you want to identify the redirect chains.
-favicon
This option shows the mmh3 hash of the favicon. It is very useful when you want to identify well-known applications such as Jenkins, or other applications that ship the same favicon across different installations.
People tend to forget to change the favicon, so this is a great way to identify them.
-hash string
This option shows the hash of the response body. It is very useful when you want to identify the same responses across different hosts.
It is even more useful when you combine it with the -cl option, so you can see the size and the hash of the response.
However, I prefer keeping both the content length and the hash, because then I can clearly see when the content length is 0, meaning there is no body, and I know it right away without computing the hash. Otherwise, I would need to know the hash of an empty body to identify it.
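A sketch of combining both probes (the hash algorithm is one of several httpx supports):

```shell
# Show content length alongside a body hash: a 0-length body is obvious
# at a glance, and identical hashes reveal duplicate responses across hosts.
httpx -l hosts.txt -cl -hash sha256
```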
-jarm
This option shows the JARM fingerprint hash of the server. JARM is a TLS server fingerprinting technique that helps identify servers based on their TLS configuration. It is very useful when you want to identify the same server software, or the same server version, across different hosts.
-rt, -response-time
This option shows the response time of the server. It is very useful when you want to identify slow servers. Maybe the server is under heavy load, or a network issue is causing the delay; that alone is not very interesting. But if the server is deployed as a serverless function and has a high response time, it might be worth knowing more about it.
Or maybe the server was deployed during the development phase with not-so-optimized code. It might include debug logs, verbose error handling, etc.
-lc, -line-count and -wc, -word-count
These options show the line count and word count of the response body, respectively. They are very useful when you want to analyze the size and complexity of the response content.
For example, if the line count or word count increases significantly, it might indicate that the response has changed, and you may want to investigate further.
You can use this option instead of content-length, or in addition to it.
-title
This option shows the page title of the response. It is very useful when you want to identify the purpose of the page quickly.
You should also know when the title changes, as it might indicate a significant change in the content or purpose of the page.
-bp, -body-preview
This option shows the first N characters of the response body. It is very useful when you want to get a quick glance at the content of the page without downloading the entire body.
-server, -web-server
This option shows the server name of the response. It is very useful when you want to identify the server software being used. For example, you can identify if the server is running Apache, Nginx, IIS, etc.
This information can help you identify potential vulnerabilities or misconfigurations associated with specific server software.
-td, -tech-detect
This option shows the technology stack used by the server. It is very useful when you want to identify the frameworks, libraries, or other technologies being used by the application. For example, you can identify if the application is built using React, Angular, Django, etc.
This information can help you identify potential vulnerabilities or misconfigurations associated with specific technologies.
-method
This option shows the HTTP request method used. It is useful when you want to test multiple methods on the same endpoint.
-websocket
This option indicates if the server supports WebSocket connections. It is useful when you want to identify real-time communication capabilities of the application.
-ip
This option shows the IP address of the host. It is useful when you want to identify the server's location or network. Let's say the IP address changes. That might indicate that the server has migrated to a different hosting provider or a different region.
Or maybe the server is deployed using a rolling deployment strategy, and the IP address changes as part of the deployment process. This indicates that the new server is live, and you can start testing it.
-cname
This option shows the CNAME record of the host. It is useful when you want to identify the canonical domain name the host points to.
-extract-fqdn, -efqdn
This option extracts domains and subdomains from the response body and headers. It is useful when you want to discover additional subdomains that might be linked from the main page.
-asn
This option shows the Autonomous System Number (ASN) information of the host. It is useful when you want to identify the network or organization that owns the IP address.
Then, you can use this information to identify potential targets within the same network or organization.
-cdn
This option shows the CDN or WAF in use by the host. It is useful when you want to identify the content delivery network or web application firewall protecting the application.
It is very useful to know this information, as it can help you identify potential security measures in place that might affect your testing.
-probe
This option shows the probe status of the host. It is useful when you want to identify if the host is reachable or not. If the host is not reachable, you can skip further testing on that host.
However, you should still consider testing unreachable hosts later on, as they might become reachable in the future, or you can target them in case you find an SSRF, for example. Then you know right away which hosts you can target.
Headless browsing is a powerful feature that allows you to interact with web pages as if you were using a real browser, but without the graphical interface.
You can use these options in your automation workflows to capture screenshots of web pages, which can be useful for visual verification, monitoring changes, or identifying potential security issues.
-ss, -screenshot
This option enables saving a screenshot of the page using a headless browser. It is useful when you want to capture the visual representation of the web page for further analysis or reporting.
You don't need to visit the page manually, as the headless browser will render the page and capture the screenshot for you, and you can save it to a file or process it further.
I tend to use this option if the scope is large. I run the second step on live domains found with previous execution of httpx. Then, I download the screenshots, and quickly glance through them to identify interesting targets.
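A sketch of that second pass (the input file name is an assumption; screenshots are written under httpx's output directory):

```shell
# Re-probe previously confirmed live hosts and capture screenshots
# for a quick visual triage of the scope.
httpx -l live-hosts.txt -ss
```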
-system-chrome
This option enables using the locally installed Chrome browser for taking screenshots. It is useful when you want to leverage the capabilities of your existing Chrome installation, such as extensions or specific settings.
-ho, -headless-options string[]
This option allows you to start the headless Chrome browser with additional options. It is useful when you want to customize the behavior of the headless browser, such as setting specific flags or configurations. Sometimes, you might want to disable certain features, or enable specific ones to better mimic a real user browsing experience.
-esb, -exclude-screenshot-bytes
This option enables excluding screenshot bytes from the JSON output. It is useful when you want to reduce the size of the output, especially when dealing with a large number of screenshots.
-ehb, -exclude-headless-body
This option enables excluding the headless browser response body from the JSON output. It is useful when you want to reduce the size of the output, especially when dealing with a large number of requests.
-no-screenshot-full-page
This option disables saving full-page screenshots. It is useful when you want to capture only the visible portion of the page, which can help reduce the size of the screenshot files.
You should probably use this option when you are testing a large number of pages, and you want to save storage space. From the first glance, you can identify if the page is interesting or not, without needing the full-page screenshot.
-st, -screenshot-timeout value
This option sets the timeout for taking a screenshot in seconds. It is useful when you want to ensure that the screenshot process does not hang indefinitely. You can adjust the timeout based on the expected load time of the pages you are testing.
-sid, -screenshot-idle value
This option sets the idle time before taking a screenshot in seconds. It is useful when you want to ensure that the page has fully loaded and any dynamic content has been rendered before capturing the screenshot.
Single page applications (SPAs) often require some time to load all the content, so setting an appropriate idle time can help ensure that you capture the complete state of the page.
Matchers allow you to filter the results based on specific positive criteria. This means that only the results that match the specified criteria will be included in the output.
Since matchers and filters are very similar, think of matchers as the opposite of filters. While filters exclude results based on negative criteria, matchers include results based on positive criteria.
Let's say you only want results that return status code 200. You should use matchers for that.
If you don't want to see results returning status code 404, you should use filters for that.
Let's see the available matchers:
Personal note: I prefer not to use matchers during initial reconnaissance, because I want to see everything that is out there. I usually use matchers for the second pass, when I want to re-run the httpx (just in case) on the live hosts and have automated tests that would target only specific responses.
-mc, -match-code string
This option allows you to match responses with specified status codes. For example, you can use -mc 200,201,202,203,204,300,301,302,303,401,403 to include only responses with those status codes.
-ml, -match-length string
This option allows you to match responses with specified content lengths. For example, you can use -ml 100,200,300 to include only responses with those content lengths. This is only useful if you know the exact content lengths you want to match.
-mlc, -match-line-count string
This option allows you to match responses with specified line counts in the response body. Use this only if you know the exact line counts you want to match.
-mwc, -match-word-count string
This option allows you to match responses with specified word counts in the response body. Use this only if you know the exact word counts you want to match.
-mfc, -match-favicon string[]
This option allows you to match responses with specified favicon hashes.
Let's say you want to test a wide attack surface of Jenkins installations. You can use the known favicon hash to identify all Jenkins installations, and then test them further.
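A sketch of that Jenkins hunt — 81586312 is the mmh3 hash commonly cited for the Jenkins favicon, but verify it against a known installation before relying on it:

```shell
# Show favicon hashes and keep only hosts matching the assumed Jenkins hash.
httpx -l hosts.txt -favicon -mfc 81586312
```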
-ms, -match-string string[]
This option allows you to match responses with specified strings in the response body. Use this only if you know the exact strings you want to match.
-mr, -match-regex string[]
This option allows you to match responses with specified regular expressions in the response body. Use this only if you know the exact regex patterns you want to match.
-mcdn, -match-cdn string[]
This option allows you to match hosts with specified CDN providers. For example, you can use -mcdn cloudfront,fastly to include only hosts using those CDN providers.
-mrt, -match-response-time string
This option allows you to match responses with specified response times. For example, you can use -mrt '< 1' to include only responses with response times less than 1 second.
This is useful when you want to identify fast-responding hosts, which might indicate well-optimized applications. The inverse is also true: you might want to identify slow-responding hosts, which might indicate under-optimized applications, or serverless functions that are cold-starting.
-mdc, -match-condition string
This option allows you to match responses based on DSL expression conditions. This is a powerful feature that allows you to create complex matching conditions using a domain-specific language (DSL).
Use -ldv option to see more about the DSL syntax and available functions.
For example, you can use -mdc "status_code == 200 && content_length > 1000" to include only responses with status code 200 and content length greater than 1000 bytes.
Use this to create complex matching conditions that combine multiple criteria.
Extractors allow you to extract content from the response.
I personally am not using extractors that much, but they can be very useful if you want to include and extract instances of interesting content from the response.
-er, -extract-regex string[]
Let's say we would like to extract, for every response, the URLs present in the response body.
I usually use this option with simple regex patterns, such as:
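A sketch with a deliberately simple (not exhaustive) URL pattern — tune the regex to your target:

```shell
# Extract anything that looks like an http(s) URL from each response body.
httpx -l hosts.txt -er 'https?://[a-zA-Z0-9./?=_&%-]+'
```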
This would extract all URLs present in the response body.
-ep, -extract-preset string[]
This option allows you to extract content from the response using pre-defined regex patterns. Available presets are url, ipv4, and mail.
I tend to only use ipv4, but very rarely.
Filters allow you to filter out results based on specific negative criteria. This means that the results that match the specified criteria will be excluded from the output.
This can be useful when you want to focus on specific types of responses and ignore others.
These are the available filters:
Personal note: I use this option exclusively when I know the behavior of the target application well enough to know that error pages are not interesting for my testing.
Let's go through every single one of them.
-fc, -filter-code string
This option allows you to filter out responses with specified status codes. For example, you can use -fc 404 to exclude responses with 404 status code.
While I personally don't like filtering out anything during reconnaissance, this option can be useful when you want to ignore common status codes that are not interesting for your testing, especially when you know the target application well enough to know which status codes are not interesting.
-fep, -filter-error-page
This option allows you to filter out responses that are identified as error pages using machine learning-based detection. It is useful when you want to exclude error pages from your results, allowing you to focus on valid responses.
-fd, -filter-duplicates
This option allows you to filter out near-duplicate responses, retaining only the first response. It is useful when you want to eliminate redundant results and focus on unique responses.
It would be useful in scenarios where you know that multiple hosts are pointing to the same application, so you can avoid testing the same application multiple times.
Let's say the www and non-www domains point to the same application. You can avoid testing both of them, and only test one.
-fl, -filter-length string
This option allows you to filter out responses with specified content lengths. For example, you can use -fl 0 to exclude responses with a content length of 0 bytes.
-flc, -filter-line-count string
This option allows you to filter out responses with specified line counts in the response body. Use this only if you know the exact line counts you want to filter out.
-fwc, -filter-word-count string
This option allows you to filter out responses with specified word counts in the response body. Use this only if you know the exact word counts you want to filter out.
-ffc, -filter-favicon string[]
This option allows you to filter out responses with specified favicon hashes. Use this only if you know the exact favicon hashes you want to filter out. Let's say you want to exclude known favicon hashes such as GitBook, which would be hosted by a domain in your scope, but is not interesting for your testing.
-fs, -filter-string string[]
This option allows you to filter out responses with specified strings in the response body. Use this only if you know the exact strings you want to filter out. For example, you can filter out responses that contain "Not Found" or "Error" messages.
-fe, -filter-regex string[]
This option allows you to filter out responses with specified regular expressions in the response body. Use this only if you know the exact regex patterns you want to filter out.
-fcdn, -filter-cdn string[]
This option allows you to filter out hosts with specified CDN providers. For example, you can use -fcdn cloudfront,fastly to exclude hosts using those CDN providers.
It might be useful when you know that UI is served via CDN, but the API is not, so you want to focus only on the API endpoints.
-frt, -filter-response-time string
This option allows you to filter out responses with specified response times. For example, you can use -frt '> 5' to exclude responses with response times greater than 5 seconds.
-fdc, -filter-condition string
This option allows you to filter out responses based on DSL expression conditions. This is a powerful feature that allows you to create complex filtering conditions using a domain-specific language (DSL).
Use -ldv option to see more about the DSL syntax and available functions.
For example, you can use -fdc "status_code == 404 || content_length < 100" to exclude responses with status code 404 or content length less than 100 bytes.
Use this option for more complex filtering conditions that combine multiple criteria.
-strip
This option strips all tags in the response content. Supported formats are HTML and XML (default is HTML). It is useful when you want to analyze the plain text content of the response without any markup.
Rate-limiting options allow you to control the speed of your requests to avoid overwhelming the target server or getting blocked by rate-limiting mechanisms.
You will mostly use these options in your automation especially when the rules of engagement specify the rate limits you should follow. Please do not ignore these rules.
-t, -threads int
This option allows you to specify the number of threads to use for sending requests. Increasing the number of threads can speed up the scanning process, but it can also increase the load on the target server.
By default, httpx uses 50 threads. You can adjust this value based on the target server's capacity and your own system's resources.
-rl, -rate-limit int
This option allows you to specify the maximum number of requests to send per second. It is useful when you want to control the request rate to avoid overwhelming the target server or getting blocked by rate-limiting mechanisms.
Do not exceed the rate limits specified in the rules of engagement for your testing.
-rlm, -rate-limit-minute int
This option allows you to specify the maximum number of requests to send per minute. It is useful when you want to control the request rate over a longer period of time.
Do not exceed the rate limits specified in the rules of engagement for your testing.
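A sketch of combining concurrency and rate limits — the numbers here are placeholders for whatever the rules of engagement allow:

```shell
# Cap httpx at 10 requests per second with modest concurrency.
httpx -l hosts.txt -t 10 -rl 10
```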
As the name suggests, these options do not fit into any of the previous categories, but they are still very useful.
-pa, -probe-all-ips
This option allows you to probe all the IPs associated with the same host. It is useful when you want to test all possible IP addresses that a domain might resolve to.
This can help you identify different server configurations or versions that might be running on different IPs.
-p, -ports string[]
This option allows you to specify the ports to probe using nmap syntax. For example, you can use -p http:1,2-10,11,https:80 to probe specific ports for HTTP and HTTPS services.
Use this option when you want to test non-standard ports or specific services running on the target hosts.
-path string
This option allows you to specify a path or a list of paths to probe. You can provide a comma-separated list or a file containing the paths.
This is useful when you want to test specific endpoints on the target hosts, such as /admin, /login, or any other interesting paths.
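A sketch of probing a few interesting paths on every host (the paths are examples, not a wordlist recommendation):

```shell
# Check a handful of sensitive paths across the whole scope.
httpx -l hosts.txt -path '/admin,/login,/api/health' -sc -cl
```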
-tls-probe
This option allows you to send HTTP probes on the extracted TLS domains (dns_name). It is useful when you want to test the domains that are associated with the TLS certificates of the target hosts.
-csp-probe
This option allows you to send HTTP probes on the extracted Content Security Policy (CSP) domains. This information is very interesting, as CSP headers often include domains that are trusted by the application.
-tls-grab
This option allows you to perform TLS (SSL) data grabbing. It is useful when you want to extract TLS-related information from the target hosts, such as certificate details, supported protocols, and cipher suites.
-pipeline
This option allows you to probe and display servers supporting HTTP/1.1 pipelining. It is useful when you want to identify servers that support HTTP/1.1 pipelining, which can improve performance by allowing multiple requests to be sent without waiting for each response.
-http2
This option allows you to probe and display servers supporting HTTP/2. It is useful when you want to identify servers that support the HTTP/2 protocol.
This is interesting when you want to test for request smuggling.
-vhost
This option allows you to probe and display servers supporting virtual hosting (VHOST).
It is useful when you want to identify servers that host multiple domains on the same IP address.
-ldv, -list-dsl-variables
This option allows you to list the JSON output field keys that support DSL matchers/filters.
As mentioned before, this is useful when you want to create complex matching or filtering conditions using the DSL syntax.
These options let you keep httpx up to date, ensuring that you are using the most recent features and bug fixes.
-up, -update
Simply run this option to update httpx to the latest version. It is recommended to run this command periodically.
-duc, -disable-update-check
This option allows you to disable the automatic httpx update check. It is useful when you want to avoid automatic update notifications, especially in automated environments where you want to control the update process manually.
You can enable this option in your automation workflows to prevent unexpected interruptions due to update checks.
I tend to not use this option, as I like to be notified when there is a new version available.
Output options allow you to control the format and destination of the results generated by httpx. This is important for integrating httpx into your workflows and for further analysis of the results.
This is one of the strong points of tools developed by ProjectDiscovery, as they provide a wide range of output options to suit different needs. They have done such a good job here that it is very easy to integrate their tools into any automation workflow.
-o, -output string
This option allows you to specify a file to write the output results. It is useful when you want to save the results for further analysis or reporting. You can specify the file path where you want to save the results. The output format will depend on the other output options you have specified (e.g., JSON, CSV).
You should use this option in your workflow since the output of the tool should be uploaded as the scan output.
-oa, -output-all
This option allows you to specify a filename to write output results in all formats. It is useful when you want to save the results in multiple formats (e.g., JSON, CSV) for different types of analysis.
-sr, -store-response
This option allows you to store the HTTP response to an output directory. It is useful when you want to save the full HTTP responses for further analysis or debugging.
I personally use only the -json output option, but depending on your needs, you might want to store the full responses for further analysis.
-srd, -store-response-dir string
This option allows you to specify a custom directory to store the HTTP responses. It is useful when you want to organize the stored responses in a specific location.
It is useful when you want to keep the responses separate from other output files. Let's say you don't want to be notified about the changes in the actual responses, but you do want to keep them for further analysis.
You can separate two artifacts, one used for notifications, and as the input for the next steps in your workflow, and the other one used just for storing the responses for further analysis.
-ob, -omit-body
This option allows you to omit the response body in the output. It is useful when you want to reduce the size of the output file, especially when dealing with large responses, while keeping other information such as headers, status codes, etc.
-csv
This option allows you to store the output in CSV format.
This option tends to be useful when you want to analyze the results using spreadsheet software or other tools that support CSV format.
When reporting to non-technical stakeholders, CSV format can be more accessible and easier to understand than JSON or other formats.
-csvo, -csv-output-encoding string
This option allows you to define the output encoding for CSV format. It is useful when you want to ensure that the CSV file is encoded in a specific format (e.g., UTF-8, ASCII) for compatibility with other tools or systems.
-j, -json
This option allows you to store the output in JSONL (JSON Lines) format. It is useful when you want to save the results in a structured format that is easy to parse and analyze programmatically.
I personally use this option the most. I like to keep json format because I can easily combine it with jq tool to parse and extract results for further analysis.
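A sketch of that JSONL + jq combination (the field names `status_code` and `url` are from httpx's JSON output; verify against your version):

```shell
# Write JSONL results, then keep only 200s and print their URLs.
httpx -l hosts.txt -j -o results.jsonl
jq -r 'select(.status_code == 200) | .url' results.jsonl
```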
-irh, -include-response-header
This option allows you to include the HTTP response headers in the JSON output. It is useful when you want to analyze the response headers for further insights or debugging.
-irr, -include-response
This option allows you to include the full HTTP request and response (headers + body) in the JSON output. It is useful when you want to analyze the complete request-response cycle for further insights or debugging.
-irrb, -include-response-base64
This option allows you to include the base64 encoded HTTP request and response in the JSON output. It is useful when you want to store the complete request-response cycle in a compact format for further analysis or debugging.
You might want to use this format when you are dealing with binary data in the request or response body, as base64 encoding can help preserve the integrity of the data.
-include-chain
This option allows you to include the redirect HTTP chain in the JSON output. It is useful when you want to analyze the sequence of redirects that occurred during the request for further insights or debugging.
It is only available if you use -sr or -store-response option.
-svrc, -store-vision-recon-cluster: This option allows you to include visual recon clusters in the output. It is useful when you want to group similar screenshots together for easier analysis.
It only works when both the -ss (screenshot) and -sr (store response) options are used.
-pr, -protocol string: This option allows you to specify the protocol to use (e.g., unknown, http11). In my experience, this option is only useful when you want to make sure a certain protocol is used, which is rarely the case.
Setting http11 disables http2.
-fepp, -filter-error-page-path string: This option allows you to store filtered error pages at a specified path. The default path is "filtered_error_page.json".
Configuration options allow you to customize the behavior of httpx based on your specific needs and requirements.
-config string: This option allows you to specify the path to the httpx configuration file. By default, httpx looks for the configuration file at $HOME/.config/httpx/config.yaml.
You can use this option to load a custom configuration file that contains your preferred settings and options for httpx.
You can keep custom configuration files in blob storage. Then, you can download them as part of your automation workflow and use them with httpx.
This makes configuration management easier, especially when you have multiple automation workflows that use httpx.
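For illustration, a minimal configuration file could look like the sketch below. The keys mirror the CLI flag names, but the exact set accepted is version-dependent, so treat this as an assumption and check `httpx -h` against your install:

```yaml
# Hypothetical $HOME/.config/httpx/config.yaml -- keys mirror CLI flags
threads: 50
retries: 3
timeout: 10
follow-redirects: true
```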
-r, -resolvers string[]: This option allows you to specify a list of custom DNS resolvers to use for resolving hostnames.
Some resolvers might be faster than others, or they might provide better coverage for certain domains.
If you have specific DNS resolvers that you trust or prefer, you can use this option to ensure that httpx uses those resolvers for DNS resolution.
-allow string[]: This option allows you to specify an allowed list of IP addresses or CIDR ranges to process. Only the hosts that resolve to the specified IPs or CIDR ranges will be probed.
It is useful when you want to restrict probing to specific IPs/CIDRs, ensuring that you stay within the scope of your testing.
-deny string[]: This option allows you to specify a denied list of IP addresses or CIDR ranges to process. Hosts that resolve to the specified IPs or CIDR ranges will be excluded from probing.
It is useful when you want to exclude specific IPs/CIDRs from probing, ensuring that you avoid testing hosts that are out of scope or not relevant to your testing.
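The scoping logic behind -allow and -deny can be sketched like this (a conceptual illustration, not httpx's actual implementation): a host is probed only if one of its resolved IPs falls inside the allow list and outside the deny list.

```python
import ipaddress

# Example scope: allow the whole 10.0.0.0/8, but carve out 10.1.0.0/16
allow = [ipaddress.ip_network("10.0.0.0/8")]
deny = [ipaddress.ip_network("10.1.0.0/16")]

def in_scope(ip: str) -> bool:
    """A resolved IP is probed only if allowed and not denied."""
    addr = ipaddress.ip_address(ip)
    allowed = any(addr in net for net in allow)
    denied = any(addr in net for net in deny)
    return allowed and not denied

print(in_scope("10.2.3.4"))     # True: inside 10.0.0.0/8, outside the deny range
print(in_scope("10.1.3.4"))     # False: explicitly denied
print(in_scope("192.168.0.1"))  # False: not in the allow list at all
```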
-sni, -sni-name string: This option allows you to specify a custom TLS SNI (Server Name Indication) name to use for TLS connections. It is useful when you want to test hosts that require a specific SNI name for proper TLS negotiation.
-random-agent: This option enables the use of a random User-Agent header for each request. By default, this option is set to true.
Using a random User-Agent can help you avoid detection and blocking by target servers that may filter requests based on User-Agent strings.
-H, -header string[]: This option allows you to specify custom HTTP headers to send with each request. You can provide multiple headers by using this option multiple times.
It is useful when you want to include specific headers that the target application requires to function properly, or when you want to mimic certain client behaviors. Please, respect the rules of engagement.
-http-proxy, -proxy string: This option allows you to specify a proxy (HTTP or SOCKS) to use for sending requests.
It is useful when you want to route your requests through a proxy server for anonymity, bypassing network restrictions, or for logging and monitoring purposes.
It might also be useful to use a proxy when you want to store all the httpx traffic in your intercepting proxy for further analysis. This way, you can see all the requests and responses in your proxy tool.
-unsafe: This option allows you to send raw requests, skipping Go's normalization process.
Go normalizes certain aspects of HTTP requests, which can sometimes lead to unexpected behavior when interacting with certain servers.
-resume: This option allows you to resume a scan using a previously saved resume file (resume.cfg). It is rarely useful in automation,
but it is very useful when you are running a long scan manually and want to resume it after an interruption.
-fr, -follow-redirects: This option allows you to follow HTTP redirects (3xx responses) automatically. By default, httpx does not follow redirects.
It is useful when you want to ensure that you reach the final destination of a request, even if it involves multiple redirects.
-maxr, -max-redirects int: This option allows you to set the maximum number of redirects to follow before giving up. The default value is 10.
-fhr, -follow-host-redirects: This option allows you to follow redirects only on the same host. It is useful when you want to avoid following redirects to different domains, which might be out of scope for your testing.
-rhsts, -respect-hsts: This option allows you to respect HSTS (HTTP Strict Transport Security) response headers for redirect requests. It is useful when you want to ensure that your requests adhere to the security policies set by the target server.
-vhost-input: This option allows you to provide a list of vhosts as input. It is useful when you have a list of virtual hosts that you want to probe.
-x string: This option allows you to specify the HTTP request methods to probe. You can use 'all' to probe all HTTP methods (GET, POST, PUT, DELETE, etc.).
It is very useful when you want to test different HTTP methods for the target hosts, especially when you are looking for misconfigurations or vulnerabilities related to specific methods.
Use this option when you want to ensure that you are testing all possible methods that the target application might support. In my opinion, this option should be used when you are testing known web applications that might have misconfigurations related to HTTP methods.
-body string: This option allows you to specify a POST body to include in the HTTP request. It is useful when you want to test endpoints that require a request body, such as login forms or API endpoints.
You can use these options along with -x in your automated script that you can invoke.
To make sure that you understand how you can best automate your workflows, it is worth noting that the runner is running on your own machine. The BountyHub platform is clueless about how you are invoking the tool, so you can use any options that you want.
Understanding that, you can just as easily create exploitation scripts that you store locally on your machine, or keep them in the blob storage on the platform, and invoke your script which leverages the httpx tool to perform further testing.
From my point of view, these specialized options are best used in circumstances where you already identified potential targets, and you know that sending a certain payload might yield interesting results.
-s, -stream: This option enables stream mode, which starts processing input targets without sorting them. It is useful when you want to start probing targets immediately without waiting for the entire input to be processed.
-sd, -skip-dedupe: This option disables deduplication of input items when using stream mode. It is useful when you want to ensure that all input items are processed, even if they are duplicates.
-ldp, -leave-default-ports: This option allows you to leave default HTTP/HTTPS ports in the Host header (e.g., http://host:80, https://host:443). It is useful when you want to ensure that the Host header accurately reflects the port being used for the request.
-ztls: This option enables the use of the ztls library with autofallback to the standard Go TLS library for TLS 1.3 connections. It is useful when you want to leverage the features of the ztls library while maintaining compatibility with the standard library.
-no-decode: This option prevents httpx from decoding the response body. It is useful when you want to analyze the raw response data without any decoding applied.
-tlsi, -tls-impersonate: This option allows you to impersonate a specific TLS client certificate when making requests. It is useful when you want to test how a server responds to requests from a specific client.
The TLS impersonation feature is experimental and may not work with all servers. Use this option with caution and only when necessary.
-no-stdin: This option disables stdin processing. It is useful when you want to ensure that httpx does not read input from standard input.
-hae, -http-api-endpoint string: This option allows you to specify an experimental HTTP API endpoint. It is useful when you want to interact with httpx through an HTTP API for automation or integration purposes.
We can skip the debug options for now, as they are mostly useful for developers of httpx itself.
Optimization options allow you to fine-tune the performance of httpx based on your specific use case and environment.
These optimizations should rarely be used, but they are useful in situations where you understand the target environment well enough to know that certain optimizations will improve performance without sacrificing accuracy.
-nf, -no-fallback: This option allows you to display both probed protocols (HTTPS and HTTP) without falling back to the other protocol if one fails.
It is useful when you want to see the results for both protocols regardless of their success or failure.
-nfs, -no-fallback-scheme: This option allows you to probe with the protocol scheme specified in the input without falling back to the other protocol if one fails.
It is useful when you want to strictly adhere to the specified protocol scheme for each input.
-maxhr, -max-host-error int: This option allows you to set the maximum error count per host before skipping the remaining paths. The default value is 30.
It is useful when you want to avoid wasting time on hosts that are consistently failing, allowing you to focus on more promising targets.
-e, -exclude string[]: This option allows you to exclude hosts matching specified filters, such as 'cdn', 'private-ips', CIDR, IP, or regex patterns.
It is useful when you want to avoid probing certain types of hosts that are not relevant to your testing.
-retries int: This option allows you to specify the number of retries for failed requests. It is useful when you want to increase the chances of successfully probing hosts that may have intermittent connectivity issues.
It might be useful to add 5 retries when you are probing a large scope, and you want to ensure that transient network issues do not lead to missed hosts.
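Conceptually, retrying with exponential backoff looks like the sketch below. The delays and jitter values are illustrative only; httpx's internal retry timing is its own implementation detail.

```python
import random
import time

# Sketch of retries with exponential backoff plus jitter -- the general
# pattern behind a -retries style flag, not httpx's actual internals.
def with_retries(request, attempts=5, base=0.5):
    for attempt in range(attempts):
        try:
            return request()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure
            # Sleep base, 2*base, 4*base, ... plus jitter to avoid bursts
            time.sleep(base * 2 ** attempt + random.uniform(0, 0.1))

# A flaky stand-in that fails twice, then succeeds on the third call
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"

print(with_retries(flaky, base=0.01))  # "ok" on the third attempt
```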
-timeout int: This option allows you to specify the timeout in seconds for each request. The default value is 10 seconds. This is usually sufficient for most scenarios, but you can adjust it based on the expected response times of the target hosts.
-delay value: This option allows you to specify a duration between each HTTP request (e.g., 200ms, 1s). The default value is -1ns, which means no delay. It is useful when you want to avoid overwhelming the target server or getting blocked by rate-limiting mechanisms.
-rsts, -response-size-to-save int: If you want to limit the maximum response size to save in bytes, you can use this option. The default value is 2147483647 bytes (approximately 2GB). It is unlikely to be hit, but if you want to save storage space, you can set a lower limit.
-rstr, -response-size-to-read int: This option allows you to specify the maximum response size to read in bytes. The default value is 2147483647 bytes (approximately 2GB). It is unlikely to be hit, but if you want to save memory usage, you can set a lower limit.
The cloud option allows you to integrate httpx with ProjectDiscovery's cloud services for enhanced features and capabilities.
You should definitely take a look at ProjectDiscovery's cloud services, as they provide additional features such as centralized management, reporting, and collaboration capabilities.
I need to mention that they are doing an amazing job at it, and their cloud services are very well integrated with their tools.
I would personally highly recommend using their cloud services if you are using multiple ProjectDiscovery tools in your workflow.
You can use both platforms (BountyHub and ProjectDiscovery Cloud) simultaneously, as they provide different features and capabilities, and they can complement each other well. You can keep using BountyHub for periodic/continuous monitoring of your assets, while using ProjectDiscovery Cloud for managing your tools and scans.
As with everything, let's see a practical example of how to use httpx in a BountyHub workflow.
Let's suppose that we have a scan that discovers subdomains using the subfinder tool. Since we
already dove deep into subfinder in the previous blog post,
we won't repeat ourselves here.
Now, what is our plan?
1. When the subfinder scan completes, we want to trigger the httpx scan.
2. Download the subfinder scan artifact.
3. Parse the artifact (we used the -json output option in subfinder) to extract only the hostnames.
Let's get to it!
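The parsing step can be sketched as follows. The filenames are hypothetical, the sed filter assumes subfinder's -json records contain a "host" field (true for recent versions), and the httpx invocation is left commented since the exact flags depend on your workflow:

```shell
# Simulated subfinder JSONL artifact (stand-in for the downloaded file)
printf '%s\n' '{"host":"a.example.com"}' '{"host":"b.example.com"}' > subfinder.jsonl

# Extract unique hostnames; sed keeps this dependency-free,
# though jq -r '.host' subfinder.jsonl works just as well
sed -n 's/.*"host":"\([^"]*\)".*/\1/p' subfinder.jsonl | sort -u > hosts.txt
cat hosts.txt

# Feed the hostnames into httpx (flags as discussed earlier in this post):
# httpx -l hosts.txt -json -o httpx.jsonl -ss -system-chrome -fr -retries 3
```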
All the flags used above are documented and explained in the previous sections of this blog post.
The thing to note is that my runner has chromium installed, so I'm using -system-chrome option to leverage the installed chromium for taking screenshots.
By default, it tries to run sandboxed chromium, which might not be available in your environment.
In this blog post, we have explored the various options and features of httpx, a powerful HTTP probing tool developed by ProjectDiscovery.
We covered the different categories of options, including matchers, extractors, filters, rate-limiting, output formats, and more.
By understanding and utilizing these options effectively, you can tailor httpx to suit your specific reconnaissance and testing needs.
I encourage you to experiment with the different options and combinations to find the best approach for your use cases. Happy probing!
If you have any suggestions about the things to cover on this blog, please do not hesitate to reach out!
All contacts are available on the Support page, but I would advise raising it on Discord.