feat(Probability): add Paley-Zygmund inequality#40835
Conversation
Welcome new contributor!Thank you for contributing to Mathlib! If you haven't done so already, please review our contribution guidelines, as well as the style guide and naming conventions. In particular, we kindly remind contributors that we have guidelines regarding the use of AI when making pull requests. We use a review queue to manage reviews. If your PR does not appear there, it is probably because it is not successfully building (i.e., it doesn't have a green checkmark), has the If you haven't already done so, please come to https://leanprover.zulipchat.com/, introduce yourself, and mention your new PR. Thank you again for joining our community. |
PR summary ac7311c75bImport changes for modified filesNo significant changes to the import graph Import changes for all files
|
✅ PR Title Formatted CorrectlyThe title of this PR has been updated to match our commit style conventions. |
There was a problem hiding this comment.
The comments below are mostly about code styles and you can learn about them over here https://leanprover-community.github.io/contribute/style.html. However, some of them are probably not necessary as I have some suggestions about the overall design that can lead to some large changes for your PR.
- There's nothing special about 2 here. You should prove it for any
p : ℝ≥0∞such that1 <p<∞. It is possible that some version of this inequality is true for the endpoints1and∞, but I didn't check them now. - I think it is better to first prove the following version:
(∫ ω, Z ω ∂μ - a) ^ 2 ≤ (∫ ω, Z ω ^ 2 ∂μ) * μ.real {ω | a < Z ω}for any0≤a≤∫ ω, Z ω ∂μ(for simplicity I used 2 here but you should use a generalpas I mentioned above). And then you can prove the inequality you want as a special case ofa = θ * ∫ ω, Z ω ∂μ. You can argue that in some sense they are equivalent, but I think this version is more widely applicable. - In fact, the theorem above is not really optimal. You can even prove that
(∫ ω, Z ω ∂μ - a) ^ 2 ≤ (∫ ω, Z ω ^ 2 ∂μ) * μ.real {ω | a < Z ω}is true for any(μ.real {ω | 0 < Z ω})⁻¹ * 0≤a≤∫ ω, Z ω ∂μ. This gives the following better inequality that you can find on wikipedia:
- Let's go even further. Is it really necessary to assume that
Zis real valued? The main theorems you used includesetIntegral_mono_on₀andConvexOn.map_set_average_le, and these should work for any random variables taking values in a spaceEsuch that the following holds:
[NormedAddCommGroup E] [NormedSpace ℝ E] [CompleteSpace E] [PartialOrder E] [IsOrderedAddMonoid E] [IsOrderedModule ℝ E] [HasSolidNorm]. Of course, you also need to adapt the theorem that you want to prove a little bit:
‖∫ ω, Z ω ∂μ - a‖ ^ 2 ≤ ∫ ω, ‖Z ω‖ ^ 2 ∂μ * μ.real {ω | a < Z ω} is true for any 0≤a≤ (μ.real {ω | 0 < Z ω})⁻¹ • ∫ ω, Z ω ∂μ
You can then include the real version as an easy corollary of this theorem
| public import Mathlib.Probability.Moments.Variance | ||
| public import Mathlib.MeasureTheory.Integral.Bochner.Set | ||
| public import Mathlib.Analysis.Convex.Integral | ||
| public import Mathlib.Analysis.Convex.Mul |
There was a problem hiding this comment.
| public import Mathlib.Probability.Moments.Variance | |
| public import Mathlib.MeasureTheory.Integral.Bochner.Set | |
| public import Mathlib.Analysis.Convex.Integral | |
| public import Mathlib.Analysis.Convex.Mul | |
| public import Mathlib.MeasureTheory.Function.L2Space | |
| import Mathlib.Analysis.Convex.Integral |
This is enough. Adding #min_imports at the bottom of a file can help you figure out the minimum import needed. You can check out this thread to learn when should one use public import: https://leanprover.zulipchat.com/#narrow/channel/113488-general/topic/module.20system.20questions
| {θ : ℝ} (hθ0 : 0 ≤ θ) (hθ1 : θ ≤ 1) : (1 - θ) ^ 2 * (∫ ω, Z ω ∂μ) ^ 2 ≤ (∫ ω, Z ω ^ 2 ∂μ) | ||
| * (μ {ω | θ * ∫ ω', Z ω' ∂μ < Z ω}).toReal := by |
There was a problem hiding this comment.
| {θ : ℝ} (hθ0 : 0 ≤ θ) (hθ1 : θ ≤ 1) : (1 - θ) ^ 2 * (∫ ω, Z ω ∂μ) ^ 2 ≤ (∫ ω, Z ω ^ 2 ∂μ) | |
| * (μ {ω | θ * ∫ ω', Z ω' ∂μ < Z ω}).toReal := by | |
| {θ : ℝ} (hθ0 : 0 ≤ θ) (hθ1 : θ ≤ 1) : | |
| (1 - θ) ^ 2 * (∫ ω, Z ω ∂μ) ^ 2 ≤ | |
| (∫ ω, Z ω ^ 2 ∂μ) * μ.real {ω | θ * ∫ ω, Z ω ∂μ < Z ω} := by |
Need an extra indent here, and I think it is better to be consistent with the variable used inside the integral. Measure.real is more commonly used in the library. Always put an infix operator before the line break.
| apply mul_le_of_le_one_left (mul_nonneg hθ0 hZ_int_nn) | ||
| measureReal_le_one |
There was a problem hiding this comment.
| apply mul_le_of_le_one_left (mul_nonneg hθ0 hZ_int_nn) | |
| measureReal_le_one | |
| apply mul_le_of_le_one_left (mul_nonneg hθ0 hZ_int_nn) measureReal_le_one |
No need to have a line break as it is short enough.
| have h_cs: (∫ ω in S, Z ω ∂μ) ^ 2 ≤ | ||
| (∫ ω, Z ω ^ 2 ∂μ ) * (μ S).toReal:= by | ||
| by_cases hS : μ S = 0 | ||
| · simp [Measure.restrict_eq_zero.mpr, hS] | ||
| · have h_jensen := ConvexOn.map_set_average_le | ||
| even_two.convexOn_pow | ||
| (continuous_pow 2).continuousOn | ||
| isClosed_univ | ||
| hS | ||
| (measure_ne_top μ S) | ||
| (by simp) | ||
| hZ_int.integrableOn | ||
| hZ2.integrable_sq.integrableOn |
There was a problem hiding this comment.
Similarly there are two many unnecessary line breaks here.
| _ ≤ (∫ ω in S, Z ω ∂μ) ^ 2 := by | ||
| apply pow_le_pow_left₀ (mul_nonneg (sub_nonneg.mpr hθ1) hZ_int_nn) h_lower |
There was a problem hiding this comment.
| _ ≤ (∫ ω in S, Z ω ∂μ) ^ 2 := by | |
| apply pow_le_pow_left₀ (mul_nonneg (sub_nonneg.mpr hθ1) hZ_int_nn) h_lower | |
| _ ≤ (∫ ω in S, Z ω ∂μ) ^ 2 := | |
| pow_le_pow_left₀ (mul_nonneg (sub_nonneg.mpr hθ1) hZ_int_nn) h_lower 2 |
by exact or by apply is not prefered if you can formalize with a single proof term.
| rw [setAverage_eq, setAverage_eq, smul_eq_mul, smul_eq_mul, mul_pow, sq ((μ.real S)⁻¹), | ||
| mul_assoc, mul_le_mul_iff_of_pos_left (inv_pos.mpr hμS_pos), mul_comm, | ||
| ← div_eq_mul_inv, div_le_iff₀ hμS_pos, measureReal_def] at h_jensen |
There was a problem hiding this comment.
| rw [setAverage_eq, setAverage_eq, smul_eq_mul, smul_eq_mul, mul_pow, sq ((μ.real S)⁻¹), | |
| mul_assoc, mul_le_mul_iff_of_pos_left (inv_pos.mpr hμS_pos), mul_comm, | |
| ← div_eq_mul_inv, div_le_iff₀ hμS_pos, measureReal_def] at h_jensen | |
| rw [setAverage_eq, setAverage_eq, smul_eq_mul, smul_eq_mul, mul_pow, sq ((μ.real S)⁻¹), | |
| mul_assoc, mul_le_mul_iff_of_pos_left (inv_pos.mpr hμS_pos), mul_comm, | |
| ← div_eq_mul_inv, div_le_iff₀ hμS_pos, measureReal_def] at h_jensen |
| mul_le_mul_of_nonneg_right | ||
| (setIntegral_le_integral hZ2.integrable_sq (ae_of_all μ (fun x => sq_nonneg (Z x)))) | ||
| ENNReal.toReal_nonneg |
There was a problem hiding this comment.
| mul_le_mul_of_nonneg_right | |
| (setIntegral_le_integral hZ2.integrable_sq (ae_of_all μ (fun x => sq_nonneg (Z x)))) | |
| ENNReal.toReal_nonneg | |
| gcongr ?_ * ?_ | |
| exact setIntegral_le_integral hZ2.integrable_sq (ae_of_all μ (fun x => sq_nonneg (Z x))) |
gcongr saves you effort for find theorems like mul_le_mul_of_nonneg_right
|
Hello, can you disclose if and how you used LLMs in the creation of this PR, per the contribution guidelines? |
Hi, I just update my PR description. Thanks for the reminder! |
Add the Paley-Zygmund inequality: for a nonneg random variable Z with finite variance and 0 ≤ θ ≤ 1,
(1-θ)² E[Z]² ≤ E[Z²] · P(Z > θ E[Z]).
The proof uses Jensen's inequality applied to x² on the set {Z > θ E[Z]}.
AI disclosure: I used Claude Code as a learning aid while writing this proof. It helped me find the right Mathlib lemma names and understand tactic syntax, but I wrote every line of code myself.