Skip to content

Security/Fix: Resolve critical security vulnerabilities and core module edge cases#4609

Open
Mayne-X wants to merge 1 commit into
zeta-chain:mainfrom
Mayne-X:main
Open

Security/Fix: Resolve critical security vulnerabilities and core module edge cases#4609
Mayne-X wants to merge 1 commit into
zeta-chain:mainfrom
Mayne-X:main

Conversation

@Mayne-X

@Mayne-X Mayne-X commented Jun 27, 2026

Copy link
Copy Markdown

This PR introduces a series of crucial fixes, refactors, and security enhancements across the crosschain, fungible, observer, and crypto modules. The changes address structural issues, re-enable robust validations, resolve edge-case bugs (such as potential math underflows), and harden cryptographic and memory safety.

Core Changes Summary

  • Crosschain (x/crosschain)

  • C1: Updated status mutation methods to use pointer receivers (*CrossChainTx) to ensure data state updates persist correctly.

  • C2: Re-enabled previously disabled validation logic for inbound and outbound parameters to reinforce data integrity.

  • Fungible Tokens (x/fungible)

  • F2: Fixed the liquidity cap calculation to accurately scale using $10^{\text{decimals}}$.

  • F3: Resolved a potential mathematical underflow bug in ValidateBasic when decimals = 0.

  • F4: Implemented CacheContext rollbacks on event delivery failures across message servers to prevent corrupted or partial state changes.

  • Observer (x/observer)

  • O3: Removed the risky consensus-less keygen reset logic from the BeginBlocker.

  • O4: Added stringent nonce rollback validations to the chain reset message server.

  • O8: Added a safety guard to prevent the accidental removal of the last remaining observer in a validator set.

  • Security & Crypto (pkg/crypto, zetaclient)

  • Z1: Upgraded the Key Derivation Function (KDF) from standard SHA-256 to a much more secure PBKDF2 mechanism utilizing 600,000 iterations.

  • Z2: Hardened memory security by ensuring sensitive relayer private key material is explicitly zeroed out of memory after use.


How Has This Been Tested?

To verify these changes, the following testing procedures were executed:

  • Unit & Integration Tests: Ran the full Go test suite (go test ./...) to ensure all module-level logic, validation flows, and edge cases pass successfully.

  • Localnet Verification: Deployed the changes to a local environment to thoroughly test end-to-end CCTX (Cross-Chain Transactions) flow and state transition stability under stress.

  • Security Regression: Verified key zeroization and the updated KDF iterations locally to ensure no performance degradation on key initialization.

  • Tested CCTX in localnet

  • Tested in development environment

  • Go unit tests

  • Go integration tests

  • Tested via GitHub Actions


Note

High Risk
Changes touch cross-chain transaction state, consensus observer behavior, fungible deployment/admin paths, and cryptographic key handling—areas where bugs can affect funds, liveness, or security.

Overview
This PR tightens security and correctness across crypto, crosschain, fungible, observer, and zetaclient.

Crypto & relayer keys: AES-256-GCM now derives keys with PBKDF2-HMAC-SHA256 (600k iterations); decryption still accepts legacy single-round SHA-256 ciphertext. Relayer key loading adds Clear() and zeroing of file buffers after read.

Crosschain: CrossChainTx status helpers (SetAbort, SetPendingRevert, etc.) use pointer receivers so status updates persist. Inbound/outbound Validate() again enforces chain-specific addresses, hashes, and ballot indices.

Fungible: Default liquidity cap scales by (10^{\text{decimals}}) (not decimals as a multiplier). Gas pool seed amount avoids decimals-1 when decimals == 0; deploy rejects 0 decimals. Admin msg servers run state/EVM work in CacheContext and only commit() after events succeed.

Observer: BeginBlocker no longer forces keygen reset on observer-count mismatch. ResetChainNonces rejects lowering the stored nonce. RemoveObserverFromSet cannot remove the last observer.

Reviewed by Cursor Bugbot for commit cd33b76. Configure here.

Summary by CodeRabbit

  • New Features

    • Added stronger validation for cross-chain messages and deployment inputs.
    • Added support for clearing relayer key material from memory.
  • Bug Fixes

    • Improved AES-256-GCM decryption compatibility with older encrypted data.
    • Prevented invalid zero-decimal deployments and fixed edge cases for decimal-based amounts.
    • Added rollback protection for chain nonce updates.
    • Made observer set removal safer when only one observer remains.
  • Chores

    • Updated several state-changing flows to commit changes only after successful completion.

Greptile Summary

This PR applies a broad set of security hardening, correctness, and atomicity fixes across the crosschain, fungible, observer, and crypto modules. Several of the changes are solid improvements — pointer receiver fixes, CacheContext rollbacks, the liquidity-cap scaling fix, and the BeginBlocker keygen removal are all well-executed — but two of the three security-specific changes contain implementation bugs that leave the intended protection absent.

  • KDF upgrade (pkg/crypto): PBKDF2 with 600 k iterations is added, but a hardcoded static salt is used instead of a per-encryption random salt, partially undermining the improvement.
  • Memory zeroing (zetaclient/keys): Clear() calls []byte(s) which copies rather than aliases the string's backing array, so the zeroing loop has no effect on the original private-key bytes.
  • Last-observer guard (x/observer): RemoveObserverFromSet now silently no-ops when the set is down to one member, but callers (including msg_server_remove_observer) return success without any error, giving a false positive response to the submitter.

Confidence Score: 2/5

The PR should not be merged as-is: two of its three targeted security fixes contain implementation bugs that leave the advertised protection absent, and the observer-removal guard introduces a silent false-success path in a governance operation.

The PBKDF2 upgrade and the memory-zeroing feature both have implementation defects that neutralise their intended protections. Clear() zeroes a copy of the private key rather than the original bytes, so key material is never actually wiped. The static PBKDF2 salt means all encrypted files share a single derivation context. The RemoveObserverFromSet guard is correct in spirit but returns silently, causing the MsgRemoveObserver message handler to emit a success response even when the removal did not occur.

pkg/crypto/aes256_gcm.go, zetaclient/keys/relayer_key.go, and x/observer/keeper/observer_set.go all need rework before the security and correctness goals of this PR are actually achieved.

Security Review

  • Static PBKDF2 salt (pkg/crypto/aes256_gcm.go): The salt "zeta-chain-aes-256-gcm" is hardcoded. All deployments share a single derivation context, making any precomputed attack against this salt universally applicable. A random per-encryption salt (prepended to the ciphertext) is required.
  • Ineffective memory zeroing (zetaclient/keys/relayer_key.go): unsafeStringToMutableBytes uses a plain []byte(s) conversion, which allocates a new backing array. The zeroing loop in Clear() wipes the copy only; the original private-key bytes remain in memory until garbage collected, defeating the intended protection.

Important Files Changed

Filename Overview
pkg/crypto/aes256_gcm.go Upgrades key derivation from SHA-256 to PBKDF2 with 600k iterations, adds legacy fallback for backward compatibility — but uses a hardcoded static salt which weakens the security improvement.
zetaclient/keys/relayer_key.go Adds Clear() to zero private key material and zeroes the file read buffer, but unsafeStringToMutableBytes uses []byte(s) which copies rather than aliases, making the zeroing in Clear() completely ineffective.
x/observer/keeper/observer_set.go Adds a guard to prevent removing the last observer, but the guard returns silently without an error, causing MsgRemoveObserver to report success even when removal was blocked.
x/observer/keeper/msg_server_reset_chain_nonces.go Adds nonce rollback prevention; the int64/uint64 cast in the guard could overflow and silently bypass the check if nonces ever exceed MaxInt64.
x/crosschain/types/cctx.go Converts SetAbort/SetPendingRevert/SetPendingOutbound/SetOutboundMined/SetReverted from value receivers to pointer receivers so state mutations persist — correct fix.
x/fungible/keeper/evm.go Fixes liquidity cap calculation from incorrect linear scaling (DefaultLiquidityCap * decimals) to correct exponential scaling (DefaultLiquidityCap * 10^decimals).
x/observer/abci.go Removes the consensus-less SetKeygen call (BlockNumber=MaxInt64) from BeginBlocker, eliminating an unilateral keygen reset on every block.
Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
pkg/crypto/aes256_gcm.go:31-34
**Static PBKDF2 salt breaks per-credential isolation**

A salt's primary job in PBKDF2 is to ensure that two users with identical passwords still derive different keys, and to make precomputed (rainbow-table) attacks against the specific application salt economically infeasible. Using the hardcoded constant `"zeta-chain-aes-256-gcm"` means every encrypted relayer key in every deployment shares the same key-derivation context. An attacker who breaks one node's password gets a derivation target that applies to all deployments — the 600 k iterations slow them down but the single-salt surface remains fixed. The salt should be generated fresh per-encryption (e.g., `crypto/rand`, 16+ bytes) and prepended to the ciphertext alongside the nonce so `Decrypt` can recover it.

### Issue 2 of 4
zetaclient/keys/relayer_key.go:98-105
**`unsafeStringToMutableBytes` copies rather than aliases — memory zeroing is ineffective**

In Go, `[]byte(s)` always allocates a *new* backing array and copies the string contents. Despite the function's name and its comment ("backed by the same memory"), the returned slice does not alias `rk.PrivateKey`'s internal buffer. The loop in `Clear()` therefore zeroes a throwaway copy, while the original private-key bytes remain live in memory until the garbage collector collects them. To alias the string's backing array directly, `unsafe.Pointer` is required (e.g., via `unsafe.StringData`/`unsafe.Slice` in Go 1.20+, or `reflect.StringHeader` for older toolchains). Without that, the `Clear()` API gives callers a false sense that sensitive material has been wiped.

### Issue 3 of 4
x/observer/keeper/observer_set.go:69-71
**Silent no-op makes `MsgRemoveObserver` return a false success**

When the set contains exactly one observer, `RemoveObserverFromSet` returns the current count (1) without removing anything and without returning an error. The call site in `msg_server_remove_observer.go` (line 28) treats the returned count as authoritative: it passes it to `SetLastObserverCount` and then returns `nil` to the caller — a successful transaction response even though nothing was removed. This means a governance TX to remove the last observer will be accepted, produce no state change, and give no indication to the submitter that the operation failed. The guard logic is correct in intent, but it needs to surface an error so the caller can propagate it and the TX is marked as failed rather than silently succeeded.

### Issue 4 of 4
x/observer/keeper/msg_server_reset_chain_nonces.go:31
**Unsafe `int64` cast of `uint64` nonce can flip sign and bypass rollback guard**

`currentChainNonce.Nonce` is `uint64`; casting it to `int64` overflows to a large negative value once the nonce exceeds `math.MaxInt64`. When that happens the comparison `msg.ChainNonceHigh < int64(currentChainNonce.Nonce)` is always false (the cast value is negative), so the guard silently allows any `ChainNonceHigh` — including one far below the real nonce — to succeed. A safer comparison avoids the cast entirely: check `msg.ChainNonceHigh < 0` first (always a rollback), then compare as unsigned after the sign check.

Reviews (1): Last reviewed commit: "fix: resolve top 10 critical security bu..." | Re-trigger Greptile

Greptile also left 4 inline comments on this PR.

C1: x/crosschain/types/cctx.go - Change status mutation methods to pointer receivers
C2: x/crosschain/types/inbound_params.go,outbound_params.go - Re-enable disabled validation
F2: x/fungible/keeper/evm.go - Fix liquidity cap to use 10^decimals
F3: x/fungible/keeper/gas_coin_and_pool.go + types - Fix decimals=0 underflow in ValidateBasic
F4: x/fungible/keeper/msg_server_*.go - Add CacheContext rollback on event failure
O3: x/observer/abci.go - Remove consensus-less keygen reset from BeginBlocker
O4: x/observer/keeper/msg_server_reset_chain_nonces.go - Add nonce rollback validation
O8: x/observer/keeper/observer_set.go - Guard against removing last observer
Z1: pkg/crypto/aes256_gcm.go - Replace SHA-256 KDF with PBKDF2 (600k iterations)
Z2: zetaclient/keys/relayer_key.go - Zero out key material from memory
@Mayne-X Mayne-X requested a review from a team as a code owner June 27, 2026 17:58

@claude claude Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@coderabbitai

coderabbitai Bot commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

This PR introduces PBKDF2-based AES-256-GCM key derivation with a legacy SHA-256 fallback, activates previously disabled crosschain inbound/outbound parameter validation, wraps all fungible keeper message server state mutations in cached SDK contexts, fixes decimal exponentiation bugs, adds a nonce rollback guard in the observer module, and adds memory-zeroing for relayer key material.

Changes

AES-256-GCM Key Derivation Upgrade

Layer / File(s) Summary
PBKDF2 derivation, legacy fallback, and decrypt refactor
pkg/crypto/aes256_gcm.go
Adds PBKDF2-HMAC-SHA256 constants and helpers, removes the old single-round SHA-256 getAESKey, and refactors DecryptAES256GCM to attempt PBKDF2 first then fall back to legacy SHA-256 via a shared decryptWithKey helper.

Relayer Key Memory Zeroing

Layer / File(s) Summary
RelayerKey.Clear() and file buffer zeroing
zetaclient/keys/relayer_key.go
Adds Clear() with unsafeStringToMutableBytes to overwrite the PrivateKey string's backing bytes, zeroes the fileData buffer after JSON unmarshal, and documents the Clear() contract on LoadRelayerKey.

Crosschain Validation and CCTX Pointer Receivers

Layer / File(s) Summary
Active inbound/outbound validation and pointer receivers
x/crosschain/types/inbound_params.go, x/crosschain/types/outbound_params.go, x/crosschain/types/cctx.go
Re-enables previously commented-out ValidateAddressForChain, ValidateHashForChain, and ValidateCCTXIndex calls in both Validate() methods; changes all five CCTX status setter methods to pointer receivers.

Fungible Keeper: Cached Context, Decimal Fixes, and Validation

Layer / File(s) Summary
Decimals validation and liquidity cap math
x/fungible/types/message_deploy_fungible_coin_zrc20.go, x/fungible/keeper/evm.go, x/fungible/keeper/gas_coin_and_pool.go
ValidateBasic rejects Decimals==0; default liquidity cap is now DefaultLiquidityCap * 10^decimals; SetupChainGasCoinAndPool guards against decimals==0 underflow.
Cached-context commit across all message servers
x/fungible/keeper/msg_server_deploy_fungible_coin_zrc20.go, x/fungible/keeper/msg_server_deploy_system_contract.go, x/fungible/keeper/msg_server_pause_zrc20.go, x/fungible/keeper/msg_server_unpause_zrc20.go, x/fungible/keeper/msg_server_update_contract_bytecode.go, x/fungible/keeper/msg_server_update_gateway_contract.go
All six message server handlers now wrap state mutations in ctx.CacheContext() and call commit() only after successful event emission, ensuring partial state is not persisted on failure.

Observer Module Safety Guards

Layer / File(s) Summary
Nonce rollback guard, observer set guard, BeginBlocker cleanup
x/observer/types/errors.go, x/observer/keeper/msg_server_reset_chain_nonces.go, x/observer/keeper/observer_set.go, x/observer/abci.go
Adds ErrChainNonceRollback; ResetChainNonces rejects requests where the new high nonce is lower than the current stored nonce; RemoveObserverFromSet returns early when set size is ≤1; BeginBlocker no longer calls SetKeygen with math.MaxInt64.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title matches the PR’s main theme: security hardening and edge-case fixes across crypto and core modules.
Description check ✅ Passed The description includes the required summary and testing sections, with the checklist items completed and relevant implementation context.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Want higher recall? High effort reviews run extra passes and find more bugs. A team admin can switch effort levels in the Cursor dashboard.

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit cd33b76. Configure here.

}
if observerSet.LenUint() <= 1 {
return observerSet.LenUint()
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last observer removal desyncs state

High Severity

When only one observer remains, RemoveObserverFromSet returns without removing anyone, but RemoveObserver still calls RemoveNodeAccount and returns success. The observer can stay in the set while their node account entry is gone, leaving inconsistent observer state.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit cd33b76. Configure here.

return nil
}
return []byte(s)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clear does not zero key memory

Medium Severity

RelayerKey.Clear relies on []byte(s) to overwrite the private key string in place, but in Go that conversion copies into a new slice. Zeroing that copy leaves the original PrivateKey string bytes on the heap until GC, so the advertised memory wipe does not actually clear the secret.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit cd33b76. Configure here.

Comment thread pkg/crypto/aes256_gcm.go
Comment on lines +31 to +34
func getAESKeyPBKDF2(password string) []byte {
salt := []byte("zeta-chain-aes-256-gcm")
return pbkdf2.Key([]byte(password), salt, pbkdf2Iterations, pbkdf2KeyLen, sha256.New)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 security Static PBKDF2 salt breaks per-credential isolation

A salt's primary job in PBKDF2 is to ensure that two users with identical passwords still derive different keys, and to make precomputed (rainbow-table) attacks against the specific application salt economically infeasible. Using the hardcoded constant "zeta-chain-aes-256-gcm" means every encrypted relayer key in every deployment shares the same key-derivation context. An attacker who breaks one node's password gets a derivation target that applies to all deployments — the 600 k iterations slow them down but the single-salt surface remains fixed. The salt should be generated fresh per-encryption (e.g., crypto/rand, 16+ bytes) and prepended to the ciphertext alongside the nonce so Decrypt can recover it.

Prompt To Fix With AI
This is a comment left during a code review.
Path: pkg/crypto/aes256_gcm.go
Line: 31-34

Comment:
**Static PBKDF2 salt breaks per-credential isolation**

A salt's primary job in PBKDF2 is to ensure that two users with identical passwords still derive different keys, and to make precomputed (rainbow-table) attacks against the specific application salt economically infeasible. Using the hardcoded constant `"zeta-chain-aes-256-gcm"` means every encrypted relayer key in every deployment shares the same key-derivation context. An attacker who breaks one node's password gets a derivation target that applies to all deployments — the 600 k iterations slow them down but the single-salt surface remains fixed. The salt should be generated fresh per-encryption (e.g., `crypto/rand`, 16+ bytes) and prepended to the ciphertext alongside the nonce so `Decrypt` can recover it.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +98 to +105
// unsafeStringToMutableBytes returns a mutable byte slice backed by the same memory as s.
// This is used to zero out sensitive key material.
func unsafeStringToMutableBytes(s string) []byte {
if len(s) == 0 {
return nil
}
return []byte(s)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 security unsafeStringToMutableBytes copies rather than aliases — memory zeroing is ineffective

In Go, []byte(s) always allocates a new backing array and copies the string contents. Despite the function's name and its comment ("backed by the same memory"), the returned slice does not alias rk.PrivateKey's internal buffer. The loop in Clear() therefore zeroes a throwaway copy, while the original private-key bytes remain live in memory until the garbage collector collects them. To alias the string's backing array directly, unsafe.Pointer is required (e.g., via unsafe.StringData/unsafe.Slice in Go 1.20+, or reflect.StringHeader for older toolchains). Without that, the Clear() API gives callers a false sense that sensitive material has been wiped.

Prompt To Fix With AI
This is a comment left during a code review.
Path: zetaclient/keys/relayer_key.go
Line: 98-105

Comment:
**`unsafeStringToMutableBytes` copies rather than aliases — memory zeroing is ineffective**

In Go, `[]byte(s)` always allocates a *new* backing array and copies the string contents. Despite the function's name and its comment ("backed by the same memory"), the returned slice does not alias `rk.PrivateKey`'s internal buffer. The loop in `Clear()` therefore zeroes a throwaway copy, while the original private-key bytes remain live in memory until the garbage collector collects them. To alias the string's backing array directly, `unsafe.Pointer` is required (e.g., via `unsafe.StringData`/`unsafe.Slice` in Go 1.20+, or `reflect.StringHeader` for older toolchains). Without that, the `Clear()` API gives callers a false sense that sensitive material has been wiped.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +69 to +71
if observerSet.LenUint() <= 1 {
return observerSet.LenUint()
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Silent no-op makes MsgRemoveObserver return a false success

When the set contains exactly one observer, RemoveObserverFromSet returns the current count (1) without removing anything and without returning an error. The call site in msg_server_remove_observer.go (line 28) treats the returned count as authoritative: it passes it to SetLastObserverCount and then returns nil to the caller — a successful transaction response even though nothing was removed. This means a governance TX to remove the last observer will be accepted, produce no state change, and give no indication to the submitter that the operation failed. The guard logic is correct in intent, but it needs to surface an error so the caller can propagate it and the TX is marked as failed rather than silently succeeded.

Prompt To Fix With AI
This is a comment left during a code review.
Path: x/observer/keeper/observer_set.go
Line: 69-71

Comment:
**Silent no-op makes `MsgRemoveObserver` return a false success**

When the set contains exactly one observer, `RemoveObserverFromSet` returns the current count (1) without removing anything and without returning an error. The call site in `msg_server_remove_observer.go` (line 28) treats the returned count as authoritative: it passes it to `SetLastObserverCount` and then returns `nil` to the caller — a successful transaction response even though nothing was removed. This means a governance TX to remove the last observer will be accepted, produce no state change, and give no indication to the submitter that the operation failed. The guard logic is correct in intent, but it needs to surface an error so the caller can propagate it and the TX is marked as failed rather than silently succeeded.

How can I resolve this? If you propose a fix, please make it concise.


// validate nonce is not being rolled back
currentChainNonce, chainNonceFound := k.GetChainNonces(ctx, msg.ChainId)
if chainNonceFound && msg.ChainNonceHigh < int64(currentChainNonce.Nonce) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Unsafe int64 cast of uint64 nonce can flip sign and bypass rollback guard

currentChainNonce.Nonce is uint64; casting it to int64 overflows to a large negative value once the nonce exceeds math.MaxInt64. When that happens the comparison msg.ChainNonceHigh < int64(currentChainNonce.Nonce) is always false (the cast value is negative), so the guard silently allows any ChainNonceHigh — including one far below the real nonce — to succeed. A safer comparison avoids the cast entirely: check msg.ChainNonceHigh < 0 first (always a rollback), then compare as unsigned after the sign check.

Prompt To Fix With AI
This is a comment left during a code review.
Path: x/observer/keeper/msg_server_reset_chain_nonces.go
Line: 31

Comment:
**Unsafe `int64` cast of `uint64` nonce can flip sign and bypass rollback guard**

`currentChainNonce.Nonce` is `uint64`; casting it to `int64` overflows to a large negative value once the nonce exceeds `math.MaxInt64`. When that happens the comparison `msg.ChainNonceHigh < int64(currentChainNonce.Nonce)` is always false (the cast value is negative), so the guard silently allows any `ChainNonceHigh` — including one far below the real nonce — to succeed. A safer comparison avoids the cast entirely: check `msg.ChainNonceHigh < 0` first (always a rollback), then compare as unsigned after the sign check.

How can I resolve this? If you propose a fix, please make it concise.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
zetaclient/keys/relayer_key.go (1)

145-160: 🔒 Security & Privacy | 🟡 Minor | ⚡ Quick win

Zero fileData on every return path after read.

The current wipe is skipped when json.Unmarshal fails. Install the cleanup immediately after os.ReadFile succeeds.

Suggested fix
 	fileData, err := os.ReadFile(fileNameFull)
 	if err != nil {
 		return nil, errors.Wrapf(err, "unable to read relayer key data: %s", fileNameFull)
 	}
+	defer func() {
+		for i := range fileData {
+			fileData[i] = 0
+		}
+	}()
 
 	// unmarshal the JSON data into the struct
 	var key RelayerKey
 	err = json.Unmarshal(fileData, &key)
 	if err != nil {
 		return nil, errors.Wrap(err, "unable to unmarshal relayer key")
 	}
-
-	// zero out the file data buffer to reduce exposure of sensitive key material
-	for i := range fileData {
-		fileData[i] = 0
-	}
 
 	return &key, nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@zetaclient/keys/relayer_key.go` around lines 145 - 160, The wipe of fileData
in relayer_key.go is only happening after json.Unmarshal succeeds, so sensitive
key bytes can remain in memory on the error path. Move the zeroing cleanup to
run immediately after os.ReadFile succeeds in the relayer key loading flow, and
ensure it executes on every subsequent return path from the function that reads
and unmarshals RelayerKey data.
🧹 Nitpick comments (1)
pkg/crypto/aes256_gcm.go (1)

105-112: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Reject structurally invalid ciphertext before PBKDF2.

Malformed short inputs currently trigger the 600k-iteration KDF before decryptWithKey returns ciphertext too short. Add a cheap precheck before deriving keys. As per path instructions, review Go code for performance.

Suggested direction
 func DecryptAES256GCM(ciphertext []byte, password string) ([]byte, error) {
+	if len(ciphertext) < 12 {
+		return nil, errors.New("ciphertext too short")
+	}
+
 	// try PBKDF2 key derivation first
 	plaintext, err := decryptWithKey(ciphertext, getAESKeyPBKDF2(password))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/crypto/aes256_gcm.go` around lines 105 - 112, DecryptAES256GCM should
reject structurally invalid ciphertext before doing any expensive key
derivation. Add a cheap upfront validation in DecryptAES256GCM, before the calls
to getAESKeyPBKDF2 and getAESKeyLegacy, so short/malformed inputs fail
immediately instead of running the KDFs. Keep the existing decryptWithKey flow
and legacy fallback unchanged for valid inputs, but ensure the new precheck
blocks obviously invalid ciphertext early.

Source: Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/crypto/aes256_gcm.go`:
- Around line 30-33: The PBKDF2 key derivation in getAESKeyPBKDF2 currently uses
a fixed salt, so update the AES-GCM flow to generate and persist a random
per-ciphertext salt along with version/KDF metadata and use that metadata during
decryption for backward-compatible handling of legacy payloads. Adjust the
encrypt/decrypt paths in aes256_gcm.go so the salt is read from or written into
the ciphertext format, and move the PBKDF2 call to happen only after the
ciphertext length check so invalid inputs are rejected before any derivation
work.

In `@x/crosschain/types/inbound_params.go`:
- Around line 26-27: The shared hash check used by InboundParams.Validate()
still accepts any hex-decodable Ethereum/Zeta value, so tighten
ValidateHashForChain in validate.go to require a 32-byte decoded hash before
returning success. Update the helper to reject shorter/longer inputs like 0x01,
and keep the existing callers such as InboundParams.Validate() and the outbound
path relying on that centralized validation.

In `@x/fungible/keeper/msg_server_update_contract_bytecode.go`:
- Around line 67-70: Reject malformed NewCodeHash before it reaches state
updates, because MsgUpdateContractBytecode currently relies on
ethcommon.HexToHash in msg_server_update_contract_bytecode.go which will
silently pad or truncate invalid input. Add a strict 32-byte hex validation in
MsgUpdateContractBytecode.ValidateBasic so only properly formatted hashes are
accepted, and keep the SetAccount flow unchanged once the field is guaranteed
valid.

In `@x/observer/keeper/msg_server_reset_chain_nonces.go`:
- Around line 30-37: The rollback guard in ResetChainNonces is comparing mixed
signed/unsigned values, which can bypass the protection once the stored nonce is
above int64 range. Update the check in msg_server_reset_chain_nonces.go inside
the chain nonce handling to compare msg.ChainNonceHigh as uint64 against
currentChainNonce.Nonce after the existing non-negative validation, and keep the
ErrChainNonceRollback path unchanged.

In `@x/observer/keeper/observer_set.go`:
- Around line 69-70: The early-return guard in RemoveObserverFromSet changes the
observer-set behavior by preventing removal of the last observer, which breaks
the contract expected by downstream tests. Remove this len<=1 shortcut from
RemoveObserverFromSet and keep that method able to persist an empty set with
count 0; if you still need to protect validator membership, move that non-empty
safeguard into the validator-set-specific path such as the code that updates or
enforces validator membership rather than the observer removal flow.

In `@zetaclient/keys/relayer_key.go`:
- Around line 43-44: The Clear() contract for relayer_key.go is misleading
because it cannot zero out string-backed PrivateKey material. Update the key
handling in the RelayerKey type and its Clear() behavior so the decrypted secret
is stored in an owned []byte buffer that can actually be wiped, or revise the
comment/API to say it only drops references instead of claiming zeroization. Use
the RelayerKey and Clear() symbols to locate the affected logic and make the
memory-safety guarantee match the implementation.

---

Outside diff comments:
In `@zetaclient/keys/relayer_key.go`:
- Around line 145-160: The wipe of fileData in relayer_key.go is only happening
after json.Unmarshal succeeds, so sensitive key bytes can remain in memory on
the error path. Move the zeroing cleanup to run immediately after os.ReadFile
succeeds in the relayer key loading flow, and ensure it executes on every
subsequent return path from the function that reads and unmarshals RelayerKey
data.

---

Nitpick comments:
In `@pkg/crypto/aes256_gcm.go`:
- Around line 105-112: DecryptAES256GCM should reject structurally invalid
ciphertext before doing any expensive key derivation. Add a cheap upfront
validation in DecryptAES256GCM, before the calls to getAESKeyPBKDF2 and
getAESKeyLegacy, so short/malformed inputs fail immediately instead of running
the KDFs. Keep the existing decryptWithKey flow and legacy fallback unchanged
for valid inputs, but ensure the new precheck blocks obviously invalid
ciphertext early.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a9790857-3a5c-4283-a8d0-e8079dee4ff7

📥 Commits

Reviewing files that changed from the base of the PR and between 501fd44 and cd33b76.

📒 Files selected for processing (18)
  • pkg/crypto/aes256_gcm.go
  • x/crosschain/types/cctx.go
  • x/crosschain/types/inbound_params.go
  • x/crosschain/types/outbound_params.go
  • x/fungible/keeper/evm.go
  • x/fungible/keeper/gas_coin_and_pool.go
  • x/fungible/keeper/msg_server_deploy_fungible_coin_zrc20.go
  • x/fungible/keeper/msg_server_deploy_system_contract.go
  • x/fungible/keeper/msg_server_pause_zrc20.go
  • x/fungible/keeper/msg_server_unpause_zrc20.go
  • x/fungible/keeper/msg_server_update_contract_bytecode.go
  • x/fungible/keeper/msg_server_update_gateway_contract.go
  • x/fungible/types/message_deploy_fungible_coin_zrc20.go
  • x/observer/abci.go
  • x/observer/keeper/msg_server_reset_chain_nonces.go
  • x/observer/keeper/observer_set.go
  • x/observer/types/errors.go
  • zetaclient/keys/relayer_key.go
💤 Files with no reviewable changes (1)
  • x/observer/abci.go

Comment thread pkg/crypto/aes256_gcm.go
Comment on lines +30 to +33
// getAESKeyPBKDF2 derives a key using PBKDF2-HMAC-SHA256 with a static salt.
func getAESKeyPBKDF2(password string) []byte {
salt := []byte("zeta-chain-aes-256-gcm")
return pbkdf2.Key([]byte(password), salt, pbkdf2Iterations, pbkdf2KeyLen, sha256.New)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
echo '--- file ---'
wc -l pkg/crypto/aes256_gcm.go
echo '--- excerpt ---'
cat -n pkg/crypto/aes256_gcm.go | sed -n '1,220p'
echo '--- search call sites ---'
rg -n "EncryptAES|DecryptAES|getAESKeyPBKDF2|pbkdf2Iterations|pbkdf2KeyLen" . -S

Repository: zeta-chain/node

Length of output: 8851


Use a per-ciphertext salt for PBKDF2.

The fixed salt makes every password derive the same AES key, so the KDF no longer provides per-message uniqueness. Store a random salt with each ciphertext and include version/KDF metadata so legacy payloads can still be decrypted. Also defer PBKDF2 until after the ciphertext length check to avoid unnecessary work on obviously invalid input.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/crypto/aes256_gcm.go` around lines 30 - 33, The PBKDF2 key derivation in
getAESKeyPBKDF2 currently uses a fixed salt, so update the AES-GCM flow to
generate and persist a random per-ciphertext salt along with version/KDF
metadata and use that metadata during decryption for backward-compatible
handling of legacy payloads. Adjust the encrypt/decrypt paths in aes256_gcm.go
so the salt is read from or written into the ciphertext format, and move the
PBKDF2 call to happen only after the ciphertext length check so invalid inputs
are rejected before any derivation work.

Comment on lines +26 to +27
if err := ValidateHashForChain(m.ObservedHash, m.SenderChainId); err != nil {
return errors.Wrap(err, "invalid inbound tx observed hash")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Tighten Ethereum/Zeta hash validation to a fixed hash length.

Line 26 now relies on ValidateHashForChain, but x/crosschain/types/validate.go:26-35 currently accepts any hex-decodable Ethereum/Zeta value. Inputs like 0x01 would pass here even though they are not transaction hashes, so this hardening still leaves malformed hashes admissible. Please enforce a 32-byte decoded length in the shared helper before relying on it from InboundParams.Validate() (this will also cover the outbound path).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@x/crosschain/types/inbound_params.go` around lines 26 - 27, The shared hash
check used by InboundParams.Validate() still accepts any hex-decodable
Ethereum/Zeta value, so tighten ValidateHashForChain in validate.go to require a
32-byte decoded hash before returning success. Update the helper to reject
shorter/longer inputs like 0x01, and keep the existing callers such as
InboundParams.Validate() and the outbound path relying on that centralized
validation.

Comment on lines +67 to +70
tmpCtx, commit := ctx.CacheContext()
oldCodeHash := acct.CodeHash
acct.CodeHash = ethcommon.HexToHash(msg.NewCodeHash).Bytes()
err = k.evmKeeper.SetAccount(ctx, contractAddress, *acct)
err = k.evmKeeper.SetAccount(tmpCtx, contractAddress, *acct)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Inspect validation for MsgUpdateContractBytecode.NewCodeHash.

rg -n -C 6 'MsgUpdateContractBytecode|NewCodeHash|ValidateBasic' .

Repository: zeta-chain/node

Length of output: 50372


Reject malformed NewCodeHash before conversion. ethcommon.HexToHash pads/truncates input, so a short or overlong hash can still be written into account state as a different 32-byte value. Add a strict 32-byte hex check in MsgUpdateContractBytecode.ValidateBasic.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@x/fungible/keeper/msg_server_update_contract_bytecode.go` around lines 67 -
70, Reject malformed NewCodeHash before it reaches state updates, because
MsgUpdateContractBytecode currently relies on ethcommon.HexToHash in
msg_server_update_contract_bytecode.go which will silently pad or truncate
invalid input. Add a strict 32-byte hex validation in
MsgUpdateContractBytecode.ValidateBasic so only properly formatted hashes are
accepted, and keep the SetAccount flow unchanged once the field is guaranteed
valid.

Comment on lines +30 to +37
currentChainNonce, chainNonceFound := k.GetChainNonces(ctx, msg.ChainId)
if chainNonceFound && msg.ChainNonceHigh < int64(currentChainNonce.Nonce) {
return nil, errors.Wrapf(
types.ErrChainNonceRollback,
"cannot roll back chain nonce from %d to %d",
currentChainNonce.Nonce,
msg.ChainNonceHigh,
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Inspect nonce field types and writers/readers"
rg -n -C2 'type ChainNonces|Nonce: uint64|ChainNonceHigh|SetChainNonces\(|GetChainNonces\(' x/observer

echo
echo "Inspect the exact ResetChainNonces implementation"
sed -n '14,60p' x/observer/keeper/msg_server_reset_chain_nonces.go

Repository: zeta-chain/node

Length of output: 19457


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Search every ChainNonces write path"
rg -n -C2 'Nonce:\s*|SetChainNonces\(' x/observer --glob '!**/*.pb.go'

echo
echo "Inspect ChainNonces definition"
sed -n '1,120p' x/observer/types/chain_nonces.pb.go

echo
echo "Inspect sample generators used in tests and genesis"
sed -n '1,220p' x/observer/types/sample/chain_nonces.go
sed -n '1,220p' x/observer/types/sample/genesis.go
sed -n '1,220p' x/observer/genesis.go

echo
echo "Inspect any resets or updates that can raise nonce values"
sed -n '1,220p' x/observer/keeper/pending_nonces.go
sed -n '1,220p' x/observer/keeper/msg_server_reset_chain_nonces.go

Repository: zeta-chain/node

Length of output: 10086


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Locate sample and genesis helpers"
fd -a 'chain_nonces.go|genesis.go' x/observer/types x/observer

echo
echo "Inspect genesis state wiring for chain nonces"
sed -n '1,220p' x/observer/types/genesis.go
sed -n '1,220p' x/observer/types/genesis.pb.go
sed -n '1,220p' x/observer/genesis.go

echo
echo "Inspect any sample generator for ChainNonces values"
fd -a 'sample.go|*.go' x/observer/types/sample x/observer/sample x/observer
rg -n 'ChainNoncesList|Nonce:\s*rand|Nonce:\s*uint64|Nonce:\s*0' x/observer/types x/observer --glob '!**/*.pb.go'

Repository: zeta-chain/node

Length of output: 17007


Compare the rollback guard in the unsigned domain.

currentChainNonce.Nonce is uint64, but the guard casts it to int64. That can let a lower ChainNonceHigh slip through once the stored nonce exceeds math.MaxInt64, overwriting persisted state. After the existing non-negative check, compare uint64(msg.ChainNonceHigh) against currentChainNonce.Nonce instead.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@x/observer/keeper/msg_server_reset_chain_nonces.go` around lines 30 - 37, The
rollback guard in ResetChainNonces is comparing mixed signed/unsigned values,
which can bypass the protection once the stored nonce is above int64 range.
Update the check in msg_server_reset_chain_nonces.go inside the chain nonce
handling to compare msg.ChainNonceHigh as uint64 against currentChainNonce.Nonce
after the existing non-negative validation, and keep the ErrChainNonceRollback
path unchanged.

Comment on lines +69 to +70
if observerSet.LenUint() <= 1 {
return observerSet.LenUint()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift

This guard changes the observer-set contract, not just validator-set safety.

With this early return, removing the sole observer now becomes a no-op that returns 1. That conflicts with the supplied downstream expectation in x/observer/keeper/observer_set_test.go:147-163, which removes the last observer and expects an empty persisted set with count 0, and it is broader than the PR objective’s validator-set safeguard. If the intent is only to keep validator membership non-empty, this check belongs in that path rather than RemoveObserverFromSet.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@x/observer/keeper/observer_set.go` around lines 69 - 70, The early-return
guard in RemoveObserverFromSet changes the observer-set behavior by preventing
removal of the last observer, which breaks the contract expected by downstream
tests. Remove this len<=1 shortcut from RemoveObserverFromSet and keep that
method able to persist an empty set with count 0; if you still need to protect
validator membership, move that non-empty safeguard into the
validator-set-specific path such as the code that updates or enforces validator
membership rather than the observer removal flow.

Comment on lines +43 to +44
// The caller should call Clear() on the returned key when it is no longer needed
// to zero out the private key material from memory.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Map the file before reading targeted slices.
ast-grep outline zetaclient/keys/relayer_key.go --view expanded || true

echo '--- file with line numbers (1-260) ---'
cat -n zetaclient/keys/relayer_key.go | sed -n '1,260p'

Repository: zeta-chain/node

Length of output: 8375


🏁 Script executed:

python3 - <<'PY'
import subprocess, textwrap, tempfile, os, sys

src = r'''
package main
import (
  "fmt"
  "unsafe"
)
func main() {
  s := "secret"
  b := []byte(s)
  fmt.Printf("len=%d same_ptr=%v before=%q\n", len(b), unsafe.StringData(s) == &b[0], b)
  for i := range b { b[i] = 0 }
  fmt.Printf("after=%q string=%q\n", b, s)
}
'''
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "main.go")
    with open(path, "w") as f:
        f.write(src)
    out = subprocess.check_output(["go", "run", path], stderr=subprocess.STDOUT, text=True)
    print(out)
PY

Repository: zeta-chain/node

Length of output: 240


Clear() cannot zero string-backed key material
[]byte(s) allocates a copy, so this loop clears only the temporary slice; the original PrivateKey string bytes remain until garbage collected. Store decrypted key material in an owned []byte and clear that buffer, or change the contract to “drop the reference” instead of claiming zeroization.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@zetaclient/keys/relayer_key.go` around lines 43 - 44, The Clear() contract
for relayer_key.go is misleading because it cannot zero out string-backed
PrivateKey material. Update the key handling in the RelayerKey type and its
Clear() behavior so the decrypted secret is stored in an owned []byte buffer
that can actually be wiped, or revise the comment/API to say it only drops
references instead of claiming zeroization. Use the RelayerKey and Clear()
symbols to locate the affected logic and make the memory-safety guarantee match
the implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant