Add data range interface to give access to underlying data #30
Pull request overview
This PR introduces a public DataRange struct and a DataRange() []DataRange capability (via type assertion) to expose where uncompressed file bytes live in underlying backing devices, and to let Writer.CopyFrom(..., MetadataOnly()) synthesize chunk mappings without requiring callers to construct internal chunk/index types.
Changes:
- Add exported DataRange type and plumb DataRange() through EROFS fs.FileInfo (fileInfo) for regular files.
- Teach Writer.CopyFrom (metadata-only path) to derive builder.Chunk mappings from DataRange() when no chunks are already present.
- Add chunksFromRanges helper to convert DataRange entries into internal chunk entries.
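
The type-assertion pattern described above could look roughly like the following sketch. The DataRange field names come from the diff in this PR, but the field types, the dataRanger interface name, the rangesOf helper, and the fakeInfo stand-in are all assumptions made for illustration and are not taken from the repository.

```go
package main

import (
	"io/fs"
	"time"
)

// DataRange mirrors the field names visible in the PR diff.
// The field types here are assumptions, not the repository's definitions.
type DataRange struct {
	Device int   // index of the backing device (0 is the primary image)
	Offset int64 // byte offset within that device
	Size   int64 // length of the range in bytes
}

// dataRanger is a hypothetical name for the optional capability.
type dataRanger interface {
	DataRange() []DataRange
}

// rangesOf reports the data ranges for fi when fi exposes them.
func rangesOf(fi fs.FileInfo) ([]DataRange, bool) {
	dr, ok := fi.(dataRanger)
	if !ok {
		return nil, false
	}
	return dr.DataRange(), true
}

// fakeInfo is a stand-in fs.FileInfo used only to exercise the assertion.
type fakeInfo struct {
	name   string
	size   int64
	ranges []DataRange
}

func (f fakeInfo) Name() string           { return f.name }
func (f fakeInfo) Size() int64            { return f.size }
func (f fakeInfo) Mode() fs.FileMode      { return 0o644 }
func (f fakeInfo) ModTime() time.Time     { return time.Time{} }
func (f fakeInfo) IsDir() bool            { return false }
func (f fakeInfo) Sys() any               { return nil }
func (f fakeInfo) DataRange() []DataRange { return f.ranges }
```

A caller that receives an fs.FileInfo without the capability simply gets false back and can fall back to reading file contents normally.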
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| mkfs.go | Uses DataRange() in metadata-only CopyFrom to build chunk mappings; adds conversion helper. |
| erofs.go | Defines exported DataRange and computes/populates per-file data ranges in Stat() results. |
94513e5 to bfa2e05
Access to underlying data may have multiple uses, including directly accessing file data from a backing store, detecting sparse areas in a reader, and efficiently building with metadata only.

Signed-off-by: Derek McGowan <derek@mcg.dev>
    switch ino.inodeLayout {
    case disk.LayoutFlatPlain:
        dataOffset := int64(ino.inodeData) << b.img.sb.BlkSizeBits
        return []DataRange{{Device: 0, Offset: dataOffset, Size: ino.size}}
For flat layouts, the device is not always 0 here: dataOffset should be adjusted against mapped_blkaddr; see mapDev.
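
One way the adjustment suggested in this comment might look is sketched below. It is not the repository's mapDev: the deviceSlot type, the mapBlock helper, and the convention that extra devices are numbered from 1 are assumptions. The idea is that a block address inside a device's mapped window resolves to that device, with the offset taken relative to the device's mapped_blkaddr.

```go
package main

// deviceSlot is a hypothetical view of one entry in the image's device table.
type deviceSlot struct {
	MappedBlkAddr uint64 // first block of this device in the unified block address space
	Blocks        uint64 // number of blocks on this device
}

// mapBlock resolves a block address to a device index and a byte offset
// within that device. Index 0 denotes the primary image; extra devices are
// numbered from 1 (an assumption of this sketch).
func mapBlock(blkAddr uint64, blkSizeBits uint, extra []deviceSlot) (int, int64) {
	for i, d := range extra {
		if blkAddr >= d.MappedBlkAddr && blkAddr-d.MappedBlkAddr < d.Blocks {
			// The offset is relative to the start of the matching device.
			return i + 1, int64(blkAddr-d.MappedBlkAddr) << blkSizeBits
		}
	}
	// No extra device covers this address, so it lives in the primary image.
	return 0, int64(blkAddr) << blkSizeBits
}
```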
    var ranges []DataRange
    if headSize > 0 {
        dataOffset := int64(ino.inodeData) << b.img.sb.BlkSizeBits
        ranges = append(ranges, DataRange{Device: 0, Offset: dataOffset, Size: headSize})
    }
    ranges = append(ranges, DataRange{Device: 0, Offset: trailingAddr, Size: tailSize})
for trailing data, device = 0 is correct.
hsiangkao left a comment
Maybe just merge this, and I will clean it all up later.
Access to underlying data may have multiple uses, including directly accessing file data from a backing store, detecting sparse areas in a reader, and efficiently building with metadata only.
This is needed to implement metadata-only mode for tar and ext4 outside of this repository without exporting internal data structures. The data range struct is simple and can easily be used in an interface without exposing internal structures.