@TheTechromancer TheTechromancer commented Nov 22, 2024

Summary

BBOT 3.0 "blazed_elijah" contains changes needed to store BBOT data in a persistent database. The idea is to release it alongside BBOT server, a tiny CLI-only database. This will be paired with a series of blog posts showing how BBOT server can be used on the command line to script out bug bounty hunting, threat intel, and ASM (i.e. running scheduled scans, exporting to CSV, diffing results over time, etc.).

Together, BBOT 3.0 and BBOT server will give us a solid foundation to build a bunch of other useful tooling, like asset inventory. Sometime in the future, it may also be useful to add a frontend.

Breaking changes

1. .data and .data_json event fields

The main breaking change in BBOT 3.0 is that an event's data is now stored under a different field name depending on whether it is a str or a dict:

  • .data: string
  • .data_json: dictionary

The siem_friendly option has been removed, since BBOT data is now SIEM-friendly by default.
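
A minimal sketch of how a consumer might read the payload after this change, assuming events are serialized as plain dictionaries that carry either a `data` key (string) or a `data_json` key (dictionary). The helper `event_payload` and the sample events are hypothetical and only illustrate the split; they are not part of BBOT's API.

```python
def event_payload(event: dict):
    """Return the payload of a serialized event, whichever field holds it."""
    # Dictionary payloads are assumed to live under "data_json"
    if "data_json" in event:
        return event["data_json"]
    # String payloads are assumed to live under "data"
    return event["data"]


# Hypothetical serialized events, for illustration only
events = [
    {"type": "DNS_NAME", "data": "www.evilcorp.com"},
    {"type": "FINDING", "data_json": {"host": "www.evilcorp.com", "description": "example"}},
]

for event in events:
    payload = event_payload(event)
    print(event["type"], type(payload).__name__, payload)
```

Because each field keeps a single type, downstream tooling such as a database schema or a SIEM index does not need to handle a field that is sometimes a string and sometimes an object.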

2. Changes to vulnerabilities

The VULNERABILITY event type has been removed in favor of FINDING, which now has several improvements:

  • A name field which holds a generic description common to all findings of the same type. This makes it easier to collapse and categorize them.
  • A confidence field
  • A severity field
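
To illustrate why a shared name makes findings easier to collapse, here is a small sketch that groups findings by their name. The records are hypothetical dictionaries; the field values shown (including the severity and confidence strings) are invented for the example and are not taken from BBOT.

```python
from collections import defaultdict

# Hypothetical findings; all values are invented for illustration
findings = [
    {"name": "Exposed admin panel", "severity": "high", "confidence": "high", "host": "a.evilcorp.com"},
    {"name": "Exposed admin panel", "severity": "high", "confidence": "medium", "host": "b.evilcorp.com"},
    {"name": "Missing security header", "severity": "low", "confidence": "high", "host": "a.evilcorp.com"},
]


def group_by_name(items):
    """Collapse findings of the same type into one bucket per name."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["name"]].append(item)
    return dict(grouped)


grouped = group_by_name(findings)
for name, members in grouped.items():
    hosts = sorted(m["host"] for m in members)
    print(f"{name}: {len(members)} finding(s) on {hosts}")
```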

Features

Potential changes

@TheTechromancer TheTechromancer changed the base branch from stable to dev November 22, 2024 01:58
@TheTechromancer TheTechromancer self-assigned this Nov 22, 2024
codecov bot commented Nov 22, 2024

Codecov Report

❌ Patch coverage is 93.77457% with 95 lines in your changes missing coverage. Please review.
✅ Project coverage is 92%. Comparing base (c977c26) to head (95b5dd4).
⚠️ Report is 18 commits behind head on dev.

Files with missing lines Patch % Lines
bbot/constants.py 71% 11 Missing ⚠️
bbot/scanner/scanner.py 87% 11 Missing ⚠️
bbot/modules/base.py 68% 10 Missing ⚠️
bbot/modules/output/nats.py 80% 7 Missing ⚠️
bbot/core/event/base.py 88% 6 Missing ⚠️
bbot/models/pydantic.py 94% 6 Missing ⚠️
bbot/modules/output/mongo.py 90% 6 Missing ⚠️
bbot/modules/internal/excavate.py 80% 5 Missing ⚠️
bbot/core/config/logger.py 20% 4 Missing ⚠️
bbot/modules/output/zeromq.py 88% 4 Missing ⚠️
... and 11 more
Additional details and impacted files
@@          Coverage Diff           @@
##             dev   #2007    +/-   ##
======================================
+ Coverage     92%     92%    +1%     
======================================
  Files        411     428    +17     
  Lines      34044   34825   +781     
======================================
+ Hits       31064   31799   +735     
- Misses      2980    3026    +46     


assert not preset_domain_with_seed_baked.in_scope("www.evilcorp.de")
assert not preset_domain_with_seed_baked.in_scope("1.2.3.4/24")

assert "www.evilcorp.org" in preset_with_target_scope_baked.target.seeds

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
www.evilcorp.org
may be at an arbitrary position in the sanitized URL.
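
For context, the pattern this alert targets can be sketched as follows. The example assumes the value being checked is a URL string; the URLs and the helper names `naive_check` and `hostname_check` are hypothetical. Whether the alert applies to the test above depends on what type `seeds` actually is, which this page does not show.

```python
from urllib.parse import urlsplit


def naive_check(url: str, host: str) -> bool:
    """Substring test: passes if the host text appears anywhere in the URL."""
    return host in url


def hostname_check(url: str, host: str) -> bool:
    """Parse the URL and compare the hostname exactly."""
    return urlsplit(url).hostname == host


# Hypothetical URLs, for illustration only
lookalike = "http://www.evilcorp.org.attacker.example/"
in_query = "http://attacker.example/?next=www.evilcorp.org"
genuine = "http://www.evilcorp.org/login"

for url in (lookalike, in_query, genuine):
    print(url, naive_check(url, "www.evilcorp.org"), hostname_check(url, "www.evilcorp.org"))
```

The substring test accepts all three URLs, while the parsed comparison accepts only the genuine one.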

Copilot Autofix

AI 23 days ago

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

"whitelist": ["1.2.3.0/24", "http://evilcorp.net/"],
"blacklist": ["bob@evilcorp.co.uk", "evilcorp.co.uk:443"],
"config": {"modules": {"secretsdb": {"otherthing": "asdf"}}},
assert "www.evilcorp.org" in preset_domain_with_seed_baked.seeds

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
www.evilcorp.org
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 23 days ago

The best way to fix this is to ensure that we are not using a substring match on potentially unsafe elements. Instead of using "www.evilcorp.org" in preset_domain_with_seed_baked.seeds, we should compare the relevant hostname/target directly and robustly.

If preset_domain_with_seed_baked.seeds is a list or set of objects (possibly with .data attributes) which represent hostnames, then the correct approach is to check either the equality of these .data fields to the string in question, or to parse any potential URLs to extract hostnames, and then compare hostnames only.

Given the context of the file (assertions in test code that use {e.data for e in ...} in previous lines), the .data values appear to be the items being compared: strings representing hosts or perhaps URLs.

Therefore, we should rewrite the problematic code to iterate over preset_domain_with_seed_baked.seeds, extract the relevant hostname (using urlparse if necessary), and check for exact matches.

Specifically:
Replace:

assert "www.evilcorp.org" in preset_domain_with_seed_baked.seeds

With:

from urllib.parse import urlparse

assert any(
    (urlparse(e.data).hostname or e.data) == "www.evilcorp.org"
    for e in preset_domain_with_seed_baked.seeds
)

And similarly for "www.evilcorp.com" in preset_domain_with_seed_baked.seeds.

Also, ensure the import for urlparse is present at the top of the file.


Suggested changeset 1
bbot/test/test_step_1/test_presets.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/bbot/test/test_step_1/test_presets.py b/bbot/test/test_step_1/test_presets.py
--- a/bbot/test/test_step_1/test_presets.py
+++ b/bbot/test/test_step_1/test_presets.py
@@ -1,8 +1,8 @@
 from ..bbot_fixtures import *  # noqa F401
 
 from bbot.scanner import Scanner, Preset
+from urllib.parse import urlparse
 
-
 # FUTURE TODO:
 # Consider testing possible edge cases:
 #  make sure custom module load directory works with cli arg module/flag/config syntax validation
@@ -322,8 +321,14 @@
         "1.2.3.0/24",
         "http://evilcorp.net/",
     }
-    assert "www.evilcorp.org" in preset_domain_with_seed_baked.seeds
-    assert "www.evilcorp.com" in preset_domain_with_seed_baked.seeds
+    assert any(
+        (urlparse(e.data).hostname or e.data) == "www.evilcorp.org"
+        for e in preset_domain_with_seed_baked.seeds
+    )
+    assert any(
+        (urlparse(e.data).hostname or e.data) == "www.evilcorp.com"
+        for e in preset_domain_with_seed_baked.seeds
+    )
     assert "1.2.3.4" in preset_domain_with_seed_baked.target.target
     assert not preset_domain_with_seed_baked.in_scope("www.evilcorp.org")
     # After merging, evilcorp.com remains in target, so its www subdomain is in-scope and in-target
EOF
"blacklist": ["bob@evilcorp.co.uk", "evilcorp.co.uk:443"],
"config": {"modules": {"secretsdb": {"otherthing": "asdf"}}},
assert "www.evilcorp.org" in preset_domain_with_seed_baked.seeds
assert "www.evilcorp.com" in preset_domain_with_seed_baked.seeds

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
www.evilcorp.com
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 23 days ago

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

assert {e.data for e in preset_nowhitelist1_baked.whitelist} == {"evilcorp.com"}
assert {e.data for e in preset_nowhitelist2_baked.whitelist} == {"evilcorp.com", "evilcorp.de"}
# Seed expansion only applies to explicit seeds (evilcorp.org), not merged targets.
assert "www.evilcorp.org" in preset_with_target_scope_baked.seeds

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
www.evilcorp.org
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 23 days ago

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

assert "www.evilcorp.com" not in preset_with_target_scope_baked.seeds
# Target expansion only applies to targets (evilcorp.com), not seeds-only domains.
assert "www.evilcorp.org" not in preset_with_target_scope_baked.target.target
assert "www.evilcorp.com" in preset_with_target_scope_baked.target.target

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
www.evilcorp.com
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 23 days ago

The fix involves replacing any substring/list membership checks on domains, such as "www.evilcorp.com" in preset_with_target_scope_baked.target.target, with a comparison based on parsed/canonicalized hostnames. Specifically:

  • Instead of using substring inclusion, the code should parse each string in preset_with_target_scope_baked.target.target using Python's urllib.parse.urlparse (for URLs), or otherwise verify that it exactly matches a hostname/domain (for raw host/domain/IP entries).
  • For each target string, extract its hostname using urlparse if it may be a URL, or treat it as a hostname directly if that is the convention.
  • Then use exact equality checks (==) for host/domain candidates, or use .endswith() with a dot prefix if subdomains should be allowed (.evilcorp.com).
  • This edit should be local to comparison lines in assertions and similar checks.
  • Add an import for urlparse from urllib.parse.
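
The exact-match and dot-prefixed suffix comparisons described in the list above can be sketched like this. The function `host_matches` and its `allow_subdomains` parameter are hypothetical names introduced for illustration; this is not the project's scope logic.

```python
def host_matches(host: str, domain: str, allow_subdomains: bool = False) -> bool:
    """Compare a hostname against a domain without substring matching."""
    host = host.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    if host == domain:
        return True
    # The leading dot keeps "notevilcorp.com" from matching "evilcorp.com"
    return allow_subdomains and host.endswith("." + domain)


print(host_matches("www.evilcorp.com", "www.evilcorp.com"))
print(host_matches("www.evilcorp.com", "evilcorp.com", allow_subdomains=True))
print(host_matches("notevilcorp.com", "evilcorp.com", allow_subdomains=True))
```

Without the dot prefix, a plain `endswith("evilcorp.com")` would also accept unrelated domains that merely share the suffix.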
Suggested changeset 1
bbot/test/test_step_1/test_presets.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/bbot/test/test_step_1/test_presets.py b/bbot/test/test_step_1/test_presets.py
--- a/bbot/test/test_step_1/test_presets.py
+++ b/bbot/test/test_step_1/test_presets.py
@@ -350,7 +350,11 @@
     assert "www.evilcorp.com" not in preset_with_target_scope_baked.seeds
     # Target expansion only applies to targets (evilcorp.com), not seeds-only domains.
     assert "www.evilcorp.org" not in preset_with_target_scope_baked.target.target
-    assert "www.evilcorp.com" in preset_with_target_scope_baked.target.target
+    from urllib.parse import urlparse
+    assert any(
+        urlparse(target).hostname == "www.evilcorp.com" or target == "www.evilcorp.com"
+        for target in preset_with_target_scope_baked.target.target
+    )
     # Scope/target checks reflect that only evilcorp.com is in the merged target.
     assert not preset_with_target_scope_baked.in_scope("www.evilcorp.org")
     assert preset_with_target_scope_baked.in_scope("www.evilcorp.com")
EOF

# Verify seeds are accessible
assert "seed1.example.com" in scan.target.seeds.inputs, "seed1.example.com should be in seeds"
assert "seed2.example.com" in scan.target.seeds.inputs, "seed2.example.com should be in seeds"

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
seed2.example.com
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 23 days ago

To fix the problem, we should ensure that membership tests check for exact (canonical) matches rather than substring matches. In Python, when bar is a string, foo in bar returns True whenever foo appears anywhere within bar, which is dangerous for URLs or FQDNs. However, if bar is a list or set, foo in bar checks for exact element matches, which is safe.

Here, the test uses "seed2.example.com" in scan.target.seeds.inputs. If scan.target.seeds.inputs is a string type, we must change this to equality (or, better, to compare against a list/set). If it is a list or set, leave as is, but verify.

Best practice:

  • Ensure scan.target.seeds.inputs is a collection (list, set, etc.). If it is a string, convert it to a collection for precise membership testing.
  • Use equality comparison or set/list membership (in collection) instead of string containment.
  • If the values are URLs, extract the hostname portion before comparison (using urllib.parse).

As we are only provided the test file and not the implementation of inputs, we'll adapt the test to be robust regardless: if scan.target.seeds.inputs is a string, cast to a collection before testing; otherwise, let the test proceed.
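
The difference between the two meanings of `in` can be shown with a short sketch. The values below are hypothetical and stand in for whatever `scan.target.seeds.inputs` holds, which is not visible on this page.

```python
needle = "seed2.example.com"

# When the right-hand side is a string, `in` is a substring test
as_string = "https://seed2.example.com.attacker.test/"
substring_hit = needle in as_string

# When the right-hand side is a list or set, `in` compares whole elements
as_list = ["seed1.example.com", "https://seed2.example.com.attacker.test/"]
as_set = {"seed1.example.com", "seed2.example.com"}
list_hit = needle in as_list
set_hit = needle in as_set

print(substring_hit, list_hit, set_hit)
```

If the attribute is already a set or list of exact seed strings, the original assertion performs an exact match and the alert may be a false positive for that line.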


Suggested changeset 1
bbot/test/test_step_1/test_scan.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/bbot/test/test_step_1/test_scan.py b/bbot/test/test_step_1/test_scan.py
--- a/bbot/test/test_step_1/test_scan.py
+++ b/bbot/test/test_step_1/test_scan.py
@@ -111,9 +111,13 @@
     assert not scan.in_target("seed2.example.com"), "Seed DNS name should not be in target"
 
     # Verify seeds are accessible
-    assert "seed1.example.com" in scan.target.seeds.inputs, "seed1.example.com should be in seeds"
-    assert "seed2.example.com" in scan.target.seeds.inputs, "seed2.example.com should be in seeds"
-    assert "192.168.1.0/24" not in scan.target.seeds.inputs, (
+    # Ensure robust, explicit membership check for exact match
+    seeds_inputs = scan.target.seeds.inputs
+    if isinstance(seeds_inputs, str):
+        seeds_inputs = [seeds_inputs]
+    assert "seed1.example.com" in seeds_inputs, "seed1.example.com should be in seeds"
+    assert "seed2.example.com" in seeds_inputs, "seed2.example.com should be in seeds"
+    assert "192.168.1.0/24" not in seeds_inputs, (
         "Target should not be in seeds when seeds are explicitly provided"
     )
 
EOF
assert "8.8.8.8/29" not in scan1.target.target
assert "2001:4860:4860::8889" in scan1.target.target
assert "2001:4860:4860::888c" not in scan1.target.target
assert "www.api.publicapis.org" in scan1.target.target

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
www.api.publicapis.org
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 23 days ago

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

assert "2001:4860:4860::8889" in scan1.target.target
assert "2001:4860:4860::888c" not in scan1.target.target
assert "www.api.publicapis.org" in scan1.target.target
assert "api.publicapis.org" in scan1.target.target

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
api.publicapis.org
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 23 days ago

The fix involves parsing the URLs (or targets) in scan1.target.target and validating that one of them has a host equal to (or, if desired, ending with) "api.publicapis.org". This can be done using Python's urllib.parse.urlparse to extract the hostname from each string. To maintain the original test's semantics, which seem to expect "api.publicapis.org" to be present as an atomic host in at least one entry, the check should parse all targets, extract their hostnames, and assert that "api.publicapis.org" is present in the set of hostnames (not in the full string arbitrarily). The change is confined to line 68, possibly adding helper code to parse and check hostnames. An appropriate import for urlparse should be added, since it's not present in the context provided.

Suggested changeset 1
bbot/test/test_step_1/test_target.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/bbot/test/test_step_1/test_target.py b/bbot/test/test_step_1/test_target.py
--- a/bbot/test/test_step_1/test_target.py
+++ b/bbot/test/test_step_1/test_target.py
@@ -6,6 +6,7 @@
     from radixtarget import RadixTarget
     from ipaddress import ip_address, ip_network
     from bbot.scanner.target import BBOTTarget, ScanSeeds
+    from urllib.parse import urlparse
 
     scan1 = bbot_scanner("api.publicapis.org", "8.8.8.8/30", "2001:4860:4860::8888/126")
     scan2 = bbot_scanner("8.8.8.8/29", "publicapis.org", "2001:4860:4860::8888/125")
@@ -65,7 +66,10 @@
     assert "2001:4860:4860::8889" in scan1.target.target
     assert "2001:4860:4860::888c" not in scan1.target.target
     assert "www.api.publicapis.org" in scan1.target.target
-    assert "api.publicapis.org" in scan1.target.target
+    assert any(
+        (urlparse(t).hostname == "api.publicapis.org" if "://" in t else t == "api.publicapis.org")
+        for t in scan1.target.target
+    )
     assert "publicapis.org" not in scan1.target.target
     assert "bob@www.api.publicapis.org" in scan1.target.target
     assert "https://www.api.publicapis.org" in scan1.target.target
EOF
assert "api.publicapis.org" in scan1.target.target
assert "publicapis.org" not in scan1.target.target
assert "bob@www.api.publicapis.org" in scan1.target.target
assert "https://www.api.publicapis.org" in scan1.target.target

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
https://www.api.publicapis.org
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 23 days ago

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

assert "publicapis.org" not in scan1.target.target
assert "bob@www.api.publicapis.org" in scan1.target.target
assert "https://www.api.publicapis.org" in scan1.target.target
assert "www.api.publicapis.org:80" in scan1.target.target

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
www.api.publicapis.org:80
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 23 days ago

The problem is that the assertion checks whether "www.api.publicapis.org:80" is a substring of scan1.target.target. This only guarantees that somewhere, possibly as a substring or embedded in a larger string, the value exists. If scan1.target.target is a list/set of URLs or strings, we should check whether it contains a host and port matching the intended one, not just a substring.

To fix this, iterate over the target collection (likely a list/set of addresses/URLs) and, for each entry, parse it as a URL or split it into host and port as appropriate. Check for an exact match, i.e. an entry whose host is "www.api.publicapis.org" and whose port is "80" (or which equals the host:port string "www.api.publicapis.org:80"), instead of relying on substring search.

This may require importing Python's urllib.parse for proper URL decomposition, or, if scan1.target.target is a collection of strings, a simple loop with parsed matching. The assertion should be changed from:

assert "www.api.publicapis.org:80" in scan1.target.target

to logic that loops over the target set and does an exact match on hostname and port (after parsing). If the entry is not a full URL but a "host:port" string, split on the last colon (so that bracketed IPv6 addresses are handled) and compare.
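
As a compact alternative to hand-written splitting, the host and port comparison can be sketched with `urllib.parse.urlsplit`, assuming the entries are plain strings (URLs or bare host:port values). The helpers `split_host_port` and `has_host_port` and the sample entries are hypothetical; this is an illustration under those assumptions, not the patch Copilot proposed.

```python
from urllib.parse import urlsplit


def split_host_port(entry: str):
    """Return (hostname, port) for a URL or a bare host:port string."""
    # Prefixing "//" makes urlsplit treat a scheme-less entry as a network location
    target = entry if "://" in entry else "//" + entry
    parts = urlsplit(target)
    return parts.hostname, parts.port


def has_host_port(entries, host: str, port: int) -> bool:
    """True if some entry has exactly this hostname and port."""
    return any(split_host_port(entry) == (host, port) for entry in entries)


# Hypothetical entries, for illustration only
entries = [
    "https://www.api.publicapis.org",
    "www.api.publicapis.org:80",
    "[2001:4860:4860::8888]:80",
]

print(has_host_port(entries, "www.api.publicapis.org", 80))
print(has_host_port(entries, "api.publicapis.org", 80))
print(split_host_port("[2001:4860:4860::8888]:80"))
```

Letting the standard library parse the network location avoids separate code paths for bracketed IPv6 addresses and for entries with or without a scheme.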


Suggested changeset 1
bbot/test/test_step_1/test_target.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/bbot/test/test_step_1/test_target.py b/bbot/test/test_step_1/test_target.py
--- a/bbot/test/test_step_1/test_target.py
+++ b/bbot/test/test_step_1/test_target.py
@@ -69,7 +69,36 @@
     assert "publicapis.org" not in scan1.target.target
     assert "bob@www.api.publicapis.org" in scan1.target.target
     assert "https://www.api.publicapis.org" in scan1.target.target
-    assert "www.api.publicapis.org:80" in scan1.target.target
+    # Ensure that www.api.publicapis.org:80 is present as a full host:port, not just as a substring
+    from urllib.parse import urlparse
+    def has_host_port(targets, host, port):
+        for item in targets:
+            parsed = urlparse(item)
+            # Handle cases like raw host:port (e.g., "[2001:4860:4860::8888]:80")
+            # or full URL (e.g., "https://www.api.publicapis.org:80" or "www.api.publicapis.org:80")
+            # first, check if scheme is present
+            if parsed.scheme:
+                netloc = parsed.netloc
+            else:
+                netloc = parsed.path
+            # Remove possible leading/trailing brackets/spaces
+            netloc = netloc.strip(" []")
+            # Split for IPv6 addresses or host:port
+            if netloc.count(":") > 1:  # IPv6
+                # e.g. [2001:4860:4860::8888]:80
+                if netloc.startswith("[") and "]:" in netloc:
+                    addr, port_part = netloc[1:].split("]:", 1)
+                    if addr == host and port_part == port:
+                        return True
+            else:
+                if ":" in netloc:
+                    host_part, port_part = netloc.rsplit(":", 1)
+                    if host_part == host and port_part == port:
+                        return True
+                elif netloc == host and (parsed.port == int(port) if parsed.port else False):
+                    return True
+        return False
+    assert has_host_port(scan1.target.target, "www.api.publicapis.org", "80")
     assert scan1.make_event("https://[2001:4860:4860::8888]:80", dummy=True) in scan1.target.target
     assert scan1.make_event("[2001:4860:4860::8888]:80", "OPEN_TCP_PORT", dummy=True) in scan1.target.target
     assert scan1.make_event("[2001:4860:4860::888c]:80", "OPEN_TCP_PORT", dummy=True) not in scan1.target.target
EOF