BBOT 3.0 - blazed_elijah #2007
base: dev
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff            @@
##              dev   #2007    +/-   ##
======================================
+ Coverage    92%     92%     +1%
======================================
  Files       411     428     +17
  Lines     34044   34825    +781
======================================
+ Hits      31064   31799    +735
- Misses     2980    3026     +46
… into scope-rework-patch
Scope rework patch
Fix multiprocess test failure
    assert not preset_domain_with_seed_baked.in_scope("www.evilcorp.de")
    assert not preset_domain_with_seed_baked.in_scope("1.2.3.4/24")

    assert "www.evilcorp.org" in preset_with_target_scope_baked.target.seeds
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
www.evilcorp.org
Copilot Autofix
AI 23 days ago
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
    "whitelist": ["1.2.3.0/24", "http://evilcorp.net/"],
    "blacklist": ["bob@evilcorp.co.uk", "evilcorp.co.uk:443"],
    "config": {"modules": {"secretsdb": {"otherthing": "asdf"}}},
    assert "www.evilcorp.org" in preset_domain_with_seed_baked.seeds
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
www.evilcorp.org
Copilot Autofix
AI 23 days ago
The best way to fix this is to ensure that we are not using a substring match on potentially unsafe elements. Instead of using "www.evilcorp.org" in preset_domain_with_seed_baked.seeds, we should compare the relevant hostname/target directly and robustly.
If preset_domain_with_seed_baked.seeds is a list or set of objects (possibly with .data attributes) which represent hostnames, then the correct approach is to check either the equality of these .data fields to the string in question, or to parse any potential URLs to extract hostnames, and then compare hostnames only.
Given the context of the file (assertions in test code that use {e.data for e in ...} in previous lines), it's clear that .data are the items -- which appear to be strings representing hosts or perhaps URLs.
Therefore, we should rewrite the problematic code to iterate over preset_domain_with_seed_baked.seeds, extract the relevant hostname (using urlparse if necessary), and check for exact matches.
Specifically:
Replace:
    assert "www.evilcorp.org" in preset_domain_with_seed_baked.seeds
With:
    from urllib.parse import urlparse
    assert any(
        (urlparse(e.data).hostname or e.data) == "www.evilcorp.org"
        for e in preset_domain_with_seed_baked.seeds
    )
And similarly for "www.evilcorp.com" in preset_domain_with_seed_baked.seeds.
Also, ensure the import for urlparse is present at the top of the file.
@@ -1,8 +1,8 @@
 from ..bbot_fixtures import *  # noqa F401

 from bbot.scanner import Scanner, Preset
+from urllib.parse import urlparse


 # FUTURE TODO:
 # Consider testing possible edge cases:
 # make sure custom module load directory works with cli arg module/flag/config syntax validation
@@ -322,8 +321,14 @@
         "1.2.3.0/24",
         "http://evilcorp.net/",
     }
-    assert "www.evilcorp.org" in preset_domain_with_seed_baked.seeds
-    assert "www.evilcorp.com" in preset_domain_with_seed_baked.seeds
+    assert any(
+        (urlparse(e.data).hostname or e.data) == "www.evilcorp.org"
+        for e in preset_domain_with_seed_baked.seeds
+    )
+    assert any(
+        (urlparse(e.data).hostname or e.data) == "www.evilcorp.com"
+        for e in preset_domain_with_seed_baked.seeds
+    )
     assert "1.2.3.4" in preset_domain_with_seed_baked.target.target
     assert not preset_domain_with_seed_baked.in_scope("www.evilcorp.org")
     # After merging, evilcorp.com remains in target, so its www subdomain is in-scope and in-target
    "blacklist": ["bob@evilcorp.co.uk", "evilcorp.co.uk:443"],
    "config": {"modules": {"secretsdb": {"otherthing": "asdf"}}},
    assert "www.evilcorp.org" in preset_domain_with_seed_baked.seeds
    assert "www.evilcorp.com" in preset_domain_with_seed_baked.seeds
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
www.evilcorp.com
Copilot Autofix
AI 23 days ago
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
    assert {e.data for e in preset_nowhitelist1_baked.whitelist} == {"evilcorp.com"}
    assert {e.data for e in preset_nowhitelist2_baked.whitelist} == {"evilcorp.com", "evilcorp.de"}
    # Seed expansion only applies to explicit seeds (evilcorp.org), not merged targets.
    assert "www.evilcorp.org" in preset_with_target_scope_baked.seeds
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
www.evilcorp.org
Copilot Autofix
AI 23 days ago
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
    assert "www.evilcorp.com" not in preset_with_target_scope_baked.seeds
    # Target expansion only applies to targets (evilcorp.com), not seeds-only domains.
    assert "www.evilcorp.org" not in preset_with_target_scope_baked.target.target
    assert "www.evilcorp.com" in preset_with_target_scope_baked.target.target
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
www.evilcorp.com
Copilot Autofix
AI 23 days ago
The fix involves replacing any substring/list membership checks on domains, such as "www.evilcorp.com" in preset_with_target_scope_baked.target.target, with a comparison based on parsed/canonicalized hostnames. Specifically:
- Instead of comparing string equality or using substring inclusion, the code should parse each string in preset_with_target_scope_baked.target.target using Python's urllib.parse.urlparse (for URLs) or otherwise verify that each is an exact match to a hostname/domain (for raw host/domain/IP entries).
- For each target string, extract its hostname using urlparse if it may be a URL, or treat it as a hostname directly if that is the convention.
- Then use exact equality checks (==) for host/domain candidates, or use .endswith() with a dot prefix if subdomains should be allowed (.evilcorp.com).
- This edit should be local to comparison lines in assertions and similar checks.
- Add an import for urlparse from urllib.parse.
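The exact-match and dot-prefixed suffix checks listed above can be sketched roughly as follows. This is a minimal illustration on plain strings, not BBOT code: `extract_host` and `host_matches` are hypothetical helper names, and the sketch assumes each entry is either a URL or a bare host.

```python
from urllib.parse import urlsplit


def extract_host(entry):
    """Return the lowercased hostname of a URL or bare host string (hypothetical helper)."""
    # urlsplit only populates .hostname when a netloc is present,
    # so bare hosts are given a "//" prefix before parsing.
    parts = urlsplit(entry if "://" in entry else "//" + entry)
    return (parts.hostname or "").lower()


def host_matches(entry, domain, allow_subdomains=False):
    """Exact hostname match, optionally accepting subdomains of `domain`."""
    host = extract_host(entry)
    if host == domain:
        return True
    # The leading dot keeps "notevilcorp.com" from matching "evilcorp.com".
    return allow_subdomains and host.endswith("." + domain)
```

With these helpers, `host_matches("https://www.evilcorp.com/login", "evilcorp.com", allow_subdomains=True)` is accepted, while `"notevilcorp.com"` and `"http://evilcorp.com.attacker.net/"` are rejected because neither hostname equals the domain or ends with `.evilcorp.com`.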
@@ -350,7 +350,11 @@
     assert "www.evilcorp.com" not in preset_with_target_scope_baked.seeds
     # Target expansion only applies to targets (evilcorp.com), not seeds-only domains.
     assert "www.evilcorp.org" not in preset_with_target_scope_baked.target.target
-    assert "www.evilcorp.com" in preset_with_target_scope_baked.target.target
+    from urllib.parse import urlparse
+    assert any(
+        urlparse(target).hostname == "www.evilcorp.com" or target == "www.evilcorp.com"
+        for target in preset_with_target_scope_baked.target.target
+    )
     # Scope/target checks reflect that only evilcorp.com is in the merged target.
     assert not preset_with_target_scope_baked.in_scope("www.evilcorp.org")
     assert preset_with_target_scope_baked.in_scope("www.evilcorp.com")
    # Verify seeds are accessible
    assert "seed1.example.com" in scan.target.seeds.inputs, "seed1.example.com should be in seeds"
    assert "seed2.example.com" in scan.target.seeds.inputs, "seed2.example.com should be in seeds"
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
seed2.example.com
Copilot Autofix
AI 23 days ago
To fix the problem, we should ensure that membership tests check for exact (canonical) matches rather than substring matches. In Python, foo in bar where bar is a string and foo is a substring will return True if and only if foo appears anywhere within bar, which is dangerous for URLs or FQDNs. However, if bar is a list or set, then it checks for exact matches, which is safe.
Here, the test uses "seed2.example.com" in scan.target.seeds.inputs. If scan.target.seeds.inputs is a string type, we must change this to equality (or, better, to compare against a list/set). If it is a list or set, leave as is, but verify.
Best practice:
- Ensure scan.target.seeds.inputs is a collection (list, set, etc.). If it is a string, convert it to a collection for precise membership testing.
- Use equality comparison or set/list membership (in collection) instead of string containment.
- If the values are URLs, extract the hostname portion before comparison (using urllib.parse).
As we are only provided the test file and not the implementation of inputs, we'll adapt the test to be robust regardless: if scan.target.seeds.inputs is a string, cast to a collection before testing; otherwise, let the test proceed.
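The distinction the alert hinges on, substring search on a str versus exact membership on a collection, can be shown with a small sketch. `HostCollection` is a hypothetical stand-in for an object that defines `__contains__`; it is not BBOT's actual seeds or target class, whose behavior is not shown on this page.

```python
class HostCollection:
    """Hypothetical container; NOT BBOT's real seeds/target implementation."""

    def __init__(self, hosts):
        self._hosts = set(hosts)

    def __contains__(self, host):
        # Exact membership only; no substring matching.
        return host in self._hosts


needle = "seed2.example.com"

as_string = "evil-seed2.example.com.attacker.net"
as_list = ["seed1.example.com", "evil-seed2.example.com.attacker.net"]
as_collection = HostCollection(as_list)

substring_hit = needle in as_string        # substring search on a str
list_hit = needle in as_list               # element equality on a list
collection_hit = needle in as_collection   # delegated to __contains__
```

Only the str case matches here. If the right-hand operand in the flagged assertions is a collection or an object with its own `__contains__`, the `in` test is already an exact-membership check rather than a substring check.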
@@ -111,9 +111,13 @@
     assert not scan.in_target("seed2.example.com"), "Seed DNS name should not be in target"

     # Verify seeds are accessible
-    assert "seed1.example.com" in scan.target.seeds.inputs, "seed1.example.com should be in seeds"
-    assert "seed2.example.com" in scan.target.seeds.inputs, "seed2.example.com should be in seeds"
-    assert "192.168.1.0/24" not in scan.target.seeds.inputs, (
+    # Ensure robust, explicit membership check for exact match
+    seeds_inputs = scan.target.seeds.inputs
+    if isinstance(seeds_inputs, str):
+        seeds_inputs = [seeds_inputs]
+    assert "seed1.example.com" in seeds_inputs, "seed1.example.com should be in seeds"
+    assert "seed2.example.com" in seeds_inputs, "seed2.example.com should be in seeds"
+    assert "192.168.1.0/24" not in seeds_inputs, (
         "Target should not be in seeds when seeds are explicitly provided"
     )
    assert "8.8.8.8/29" not in scan1.target.target
    assert "2001:4860:4860::8889" in scan1.target.target
    assert "2001:4860:4860::888c" not in scan1.target.target
    assert "www.api.publicapis.org" in scan1.target.target
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
www.api.publicapis.org
Copilot Autofix
AI 23 days ago
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
    assert "2001:4860:4860::8889" in scan1.target.target
    assert "2001:4860:4860::888c" not in scan1.target.target
    assert "www.api.publicapis.org" in scan1.target.target
    assert "api.publicapis.org" in scan1.target.target
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
api.publicapis.org
Copilot Autofix
AI 23 days ago
The fix involves parsing the URLs (or targets) in scan1.target.target and validating that one of them has a host equal to (or, if desired, ending with) "api.publicapis.org". This can be done using Python's urllib.parse.urlparse to extract the hostname from each string. To maintain the original test's semantics, which seem to expect "api.publicapis.org" to be present as an atomic host in at least one entry, the check should parse all targets, extract their hostnames, and assert that "api.publicapis.org" is present in the set of hostnames (not in the full string arbitrarily). The change is confined to line 68, possibly adding helper code to parse and check hostnames. An appropriate import for urlparse should be added, since it's not present in the context provided.
@@ -6,6 +6,7 @@
     from radixtarget import RadixTarget
     from ipaddress import ip_address, ip_network
     from bbot.scanner.target import BBOTTarget, ScanSeeds
+    from urllib.parse import urlparse

     scan1 = bbot_scanner("api.publicapis.org", "8.8.8.8/30", "2001:4860:4860::8888/126")
     scan2 = bbot_scanner("8.8.8.8/29", "publicapis.org", "2001:4860:4860::8888/125")
@@ -65,7 +66,10 @@
     assert "2001:4860:4860::8889" in scan1.target.target
     assert "2001:4860:4860::888c" not in scan1.target.target
     assert "www.api.publicapis.org" in scan1.target.target
-    assert "api.publicapis.org" in scan1.target.target
+    assert any(
+        (urlparse(t).hostname == "api.publicapis.org" if "://" in t else t == "api.publicapis.org")
+        for t in scan1.target.target
+    )
     assert "publicapis.org" not in scan1.target.target
     assert "bob@www.api.publicapis.org" in scan1.target.target
     assert "https://www.api.publicapis.org" in scan1.target.target
    assert "api.publicapis.org" in scan1.target.target
    assert "publicapis.org" not in scan1.target.target
    assert "bob@www.api.publicapis.org" in scan1.target.target
    assert "https://www.api.publicapis.org" in scan1.target.target
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
https://www.api.publicapis.org
Copilot Autofix
AI 23 days ago
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
    assert "publicapis.org" not in scan1.target.target
    assert "bob@www.api.publicapis.org" in scan1.target.target
    assert "https://www.api.publicapis.org" in scan1.target.target
    assert "www.api.publicapis.org:80" in scan1.target.target
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
www.api.publicapis.org:80
Copilot Autofix
AI 23 days ago
The problem is that the assertion checks whether "www.api.publicapis.org:80" is a substring of scan1.target.target. This only guarantees that somewhere, possibly as a substring or embedded in a larger string, the value exists. If scan1.target.target is a list/set of URLs or strings, we should check whether it contains a host and port matching the intended one, not just a substring.
To fix this, parse the target collection (likely a list/set of addresses/URLs), iterate over its elements, and, for each entry, parse it as a URL or split into host and port as appropriate. Check for exact match—i.e., is there an entry whose host is "www.api.publicapis.org" and port is "80" (or which matches the tuple "www.api.publicapis.org:80"), instead of relying on substring search.
This may require importing Python's urllib.parse for proper URL decomposition, or, if scan1.target.target is a collection of strings, a simple loop with parsed matching. The assertion should be changed from:
    assert "www.api.publicapis.org:80" in scan1.target.target
to logic that loops over the target set and does an exact match on hostname and port (after parsing). If the entry is not a full URL but a "host:port" string, split on the first colon and compare.
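As a rough illustration of exact host-and-port matching, the standard library can do the splitting, including bracketed IPv6 literals. This sketch is not the Autofix suggestion itself: `split_host_port` and `has_host_port` are hypothetical helpers, and it assumes the entries are plain strings (either full URLs or bare "host:port" values).

```python
from urllib.parse import urlsplit


def split_host_port(entry):
    """Return (hostname, port) for a URL or bare host[:port] string (hypothetical helper)."""
    # A "//" prefix makes urlsplit treat a bare "host:port" as the network location.
    parts = urlsplit(entry if "://" in entry else "//" + entry)
    # .port is an int, or None when no port is given; it raises ValueError on a malformed port.
    return parts.hostname, parts.port


def has_host_port(entries, host, port):
    """True if any entry has exactly this hostname and port."""
    return any(split_host_port(entry) == (host, port) for entry in entries)


entries = [
    "https://www.api.publicapis.org",
    "www.api.publicapis.org:80",
    "[2001:4860:4860::8888]:80",
]
found = has_host_port(entries, "www.api.publicapis.org", 80)
```

Because `urlsplit` handles the bracket syntax, `"[2001:4860:4860::8888]:80"` splits into the address and port without manual colon counting, and an entry such as `"evil-www.api.publicapis.org:80"` does not match since its hostname differs.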
@@ -69,7 +69,36 @@
     assert "publicapis.org" not in scan1.target.target
     assert "bob@www.api.publicapis.org" in scan1.target.target
     assert "https://www.api.publicapis.org" in scan1.target.target
-    assert "www.api.publicapis.org:80" in scan1.target.target
+    # Ensure that www.api.publicapis.org:80 is present as a full host:port, not just as a substring
+    from urllib.parse import urlparse
+    def has_host_port(targets, host, port):
+        for item in targets:
+            parsed = urlparse(item)
+            # Handle cases like raw host:port (e.g., "[2001:4860:4860::8888]:80")
+            # or full URL (e.g., "https://www.api.publicapis.org:80" or "www.api.publicapis.org:80")
+            # first, check if scheme is present
+            if parsed.scheme:
+                netloc = parsed.netloc
+            else:
+                netloc = parsed.path
+            # Remove possible leading/trailing brackets/spaces
+            netloc = netloc.strip(" []")
+            # Split for IPv6 addresses or host:port
+            if netloc.count(":") > 1:  # IPv6
+                # e.g. [2001:4860:4860::8888]:80
+                if netloc.startswith("[") and "]:" in netloc:
+                    addr, port_part = netloc[1:].split("]:", 1)
+                    if addr == host and port_part == port:
+                        return True
+            else:
+                if ":" in netloc:
+                    host_part, port_part = netloc.rsplit(":", 1)
+                    if host_part == host and port_part == port:
+                        return True
+                elif netloc == host and (parsed.port == int(port) if parsed.port else False):
+                    return True
+        return False
+    assert has_host_port(scan1.target.target, "www.api.publicapis.org", "80")
     assert scan1.make_event("https://[2001:4860:4860::8888]:80", dummy=True) in scan1.target.target
     assert scan1.make_event("[2001:4860:4860::8888]:80", "OPEN_TCP_PORT", dummy=True) in scan1.target.target
     assert scan1.make_event("[2001:4860:4860::888c]:80", "OPEN_TCP_PORT", dummy=True) not in scan1.target.target
Summary
BBOT 3.0 "blazed_elijah" contains changes needed to store BBOT data in a persistent database. The idea is to release it alongside BBOT server, a tiny CLI-only database. This will be paired with a series of blog posts showing how BBOT server can be used on the command line to script out bug bounty hunting, threat intel, and ASM (i.e. running scheduled scans, exporting to CSV, diffing results over time, etc.).
Together, BBOT 3.0 and BBOT server will give us a solid foundation to build a bunch of other useful tooling, like asset inventory. Sometime in the future, it may also be useful to build a frontend.
Breaking changes
1. .data and .data_json event fields
The main breaking change in BBOT 3.0 is that the name of the .data field is different based on whether it's a str or dict.
- .data: string
- .data_json: dictionary
The siem_friendly option has been removed, since BBOT data is now SIEM-friendly by default.
2. Changes to vulnerabilities
The VULNERABILITY event type has been removed in favor of FINDING, which now has several improvements:
- name field which holds a generic description common to all findings of the same type. This makes it easier to collapse and categorize them.
- confidence field
- severity field
Features
Potential changes