fix(dns): Fix response buffer race condition in concurrent queries#1054
fix(dns): Fix response buffer race condition in concurrent queries#1054abhik-roy85 wants to merge 3 commits into
Conversation
7a1adb3 to
f2b75d9
Compare
f2b75d9 to
ec6a7b6
Compare
euripedesrocha
left a comment
There was a problem hiding this comment.
I did a partial review, but there are some discussions that might lead to structural changes.
| #define ESP_DNS_BUFFER_SIZE 512 | ||
| /** Maximum number of resource records to scan in a response */ | ||
| #ifndef ESP_DNS_MAX_ANSWER_SCAN | ||
| #define ESP_DNS_MAX_ANSWER_SCAN 16 |
There was a problem hiding this comment.
Should this be controlled by the user?
| err_t status; /* Status of the answer */ | ||
| ip_addr_t ip; /* IP address from the answer */ | ||
| } dns_answer_storage_t; | ||
| #define ESP_DNS_BUFFER_SIZE 512 |
There was a problem hiding this comment.
Could be configurable as well.
| * @param response The DNS response to process. | ||
| * @param ipaddr An array to store the extracted IP addresses. | ||
| * @param response The parsed DNS response | ||
| * @param ipaddr Array to store the copied IP addresses |
There was a problem hiding this comment.
Isn't better to have user to pass max number of entries?
| uint16_t id; /* Transaction ID */ | ||
| int num_answers; /* Number of valid answers */ | ||
| dns_answer_storage_t answers[MAX_ANSWERS]; /* Array of answers */ | ||
| ip_addr_t answers[MAX_ANSWERS]; /* Array of IP addresses */ |
There was a problem hiding this comment.
Is this a breaking change? I think we are under 1.0, but it is worth to guide users on this.
| - **Certificate Errors**: | ||
| - Verify that the correct certificate is provided for secure protocols | ||
| - For public DNS servers, use the certificate bundle approach | ||
| - If you see `No matching trusted root certificate found` when using the certificate bundle (e.g. with `dns.google` or Cloudflare), the server likely uses a cross-signed chain. You must enable `CONFIG_MBEDTLS_CERTIFICATE_BUNDLE_CROSS_SIGNED_VERIFY=y` in your `sdkconfig` (under Component config → mbedTLS → Certificate Bundle → Support cross-signed certificate verification). |
There was a problem hiding this comment.
How likely is this to happen? Should we have this set by default and let user to opt-out instead?
|
|
||
| /* ESP-IDF xTaskCreate() stack size is in bytes. DoT/DoH paths pull in mbedTLS inside | ||
| * getaddrinfo(); 4 KiB is too small and triggers a stack overflow on typical builds. */ | ||
| #define DNS_ADDRINFO_TASK_STACK 8192 |
There was a problem hiding this comment.
Make it configurable?
|
|
||
| * **Certificate Issues**: | ||
| For DoT and DoH protocols, ensure that the certificates are valid for the DNS server you're using. The example includes Google DNS certificates, but these may need to be updated if they expire. | ||
| If you are using the Certificate Bundle and see a `No matching trusted root certificate found` error, it is likely due to cross-signed chains (e.g., from Google). Make sure that `CONFIG_MBEDTLS_CERTIFICATE_BUNDLE_CROSS_SIGNED_VERIFY=y` is enabled in your configuration. This can be enabled via `idf.py menuconfig` -> `Component config` -> `mbedTLS` -> `Certificate Bundle` -> `Support cross-signed certificate verification`. |
There was a problem hiding this comment.
Should it be set as default and let user opt out?
| | Supported Targets | ESP32 | ESP32-C2 | ESP32-C3 | ESP32-C5 | ESP32-C6 | ESP32-C61 | ESP32-H2 | ESP32-P4 | ESP32-S2 | ESP32-S3 | | ||
| | ----------------- | ----- | -------- | -------- | -------- | -------- | --------- | -------- | -------- | -------- | -------- | | ||
|
|
||
| This example stress-tests **`getaddrinfo()`** while the **esp_dns** component owns the lwIP **external resolve** hook. It is meant to surface concurrency bugs documented in [ISSUES_AND_PROGRESS.md](../../ISSUES_AND_PROGRESS.md): |
There was a problem hiding this comment.
Remove the reference of non-committed file.
| @@ -0,0 +1,49 @@ | |||
| # esp_dns concurrent resolver test | |||
There was a problem hiding this comment.
If this is a test app, should be treated as one. Not on the examples.
|
|
||
| This example stress-tests **`getaddrinfo()`** while the **esp_dns** component owns the lwIP **external resolve** hook. It is meant to surface concurrency bugs documented in [ISSUES_AND_PROGRESS.md](../../ISSUES_AND_PROGRESS.md): | ||
|
|
||
| - **Issue 1** — shared singleton resolver state (`response_buffer`, transaction id) under concurrent lookups. |
There was a problem hiding this comment.
Should we review the architecture and adopt a queue to the requests? This way we would fix the issu by removing the singleton.
Moved `response_buffer` out of the global handle context (`esp_dns_handle`) and into automatic/local storage within `dns_resolve_doh`, `dns_resolve_dot`, and `dns_resolve_tcp`. This prevents memory corruption and race conditions when multiple DNS queries are running concurrently. Also included `esp_dns_concurrent_test` example to validate concurrent lookup behavior.
- Simplified `dns_response_t` to store `ip_addr_t` directly instead of a nested struct. - Renamed `esp_dns_extract_ip_addresses_from_response` to `esp_dns_get_ips_from_response` and optimized the answer scanning limits. - Migrated `esp_dns_basic` and `esp_dns_concurrent_test` examples from `protocol_examples_common` to `net_connect`. - Increased the `addr_info_task` stack size in `esp_dns_basic` to 8192 bytes to prevent stack overflows during DoT/DoH mbedTLS operations. - Enabled `CONFIG_MBEDTLS_CERTIFICATE_BUNDLE_CROSS_SIGNED_VERIFY` in example `sdkconfig.defaults` to fix validation errors with cross-signed certificate chains (e.g., Google, Cloudflare). - Updated READMEs with troubleshooting steps for cross-signed certificate errors.
…494) Fix ESP-IDF-ESP_DNS-DOS-001 by making esp_dns response parsing strictly bounds-safe. A malicious DNS response could previously cause an infinite loop or read past the end of the receive buffer. - skip_dns_name(): use size_t offset to prevent wraparound, and check bounds before reading labels and compression pointers - esp_dns_parse_response(): reject responses shorter than a DNS header, track buffer_end, guard pointer arithmetic, and validate answer record headers and RR data before memcpy - TCP/DoT/DoH callers: require a minimum response size and verify the RFC 7858 length prefix before parsing
ec6a7b6 to
135fa98
Compare
Description
Moved
response_bufferout of the global handle context (esp_dns_handle) and into automatic/local storage withindns_resolve_doh,dns_resolve_dot, anddns_resolve_tcp. This prevents memory corruption and race conditions when multiple DNS queries are running concurrently.Also included
esp_dns_concurrent_testexample to validate concurrent lookup behavior.Related
Testing
Checklist
Before submitting a Pull Request, please ensure the following: