Taking a Closer Look at io.SectionReader
In the comprehensive collection of Go’s built-in packages, the io
package holds a special place with its wide-ranging functionality. Today, we’ll be zooming in on a specific feature within this package the io.SectionReader
type.
Here is the myfile.txt file that is being used throughout the article.
myfile.txt
1
Hello, world! This is an example of using SectionReader in Go
What is SectionReader?
The io.SectionReader
type in Go constructs a new reader that allows reading from a specific section of an underlying data stream, such as a file. You can envision it as a “window” into a data stream, where you have defined the starting and ending points of the view.
The structure is as follows:
1
2
3
4
5
6
7
8
// SectionReader implements Read, Seek, and ReadAt on a section
// of an underlying ReaderAt.
type SectionReader struct {
r ReaderAt
base int64
off int64
limit int64
}
And it’s constructor function is:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
// NewSectionReader returns a SectionReader that reads from r
// starting at offset off and stops with EOF after n bytes.
func NewSectionReader(r ReaderAt, off, n int64) *SectionReader {
var remaining int64
const maxint64 = 1<<63 - 1
if off <= maxint64-n {
remaining = n + off
} else {
// Overflow, with no way to return error.
// Assume we can read up to an offset of 1<<63 - 1.
remaining = maxint64
}
return &SectionReader{r, off, off, remaining}
}
This function takes a ReaderAt
, an offset and a number of bytes as arguments and returns a *SectionReader
. The returned SectionReader
behaves exactly like the original ReaderAt
with the added behavior of only allowing to read from a specific section of the underlying data defined by the provided offset and number of bytes.
The function also checks for possible overflow to ensure the offset plus the number of bytes does not exceed the maximum value for an int64, providing a safeguard
against potential errors.
Why Use SectionReader?
SectionReader offers significant benefits when dealing with large files or data streams. By focusing on only a specific section of data, you can save memory and CPU resources by not reading the entire file into memory. This becomes especially important when working with massive files that can’t comfortably fit into memory.
Additionally, SectionReader can be used to easily access and manipulate certain regions of a file without having to navigate manually. For instance, parsing file headers, reading metadata from a certain section of a file, or accessing a certain ‘chunk’ from a large data stream.
How Does SectionReader Work?
The SectionReader
type in Go’s io
package returns a structure called a SectionReader
. This structure contains four key components:
r
: This is the underlying reader from which data is read.base
: This is the starting offset from the beginning of the underlying reader.off
: This is the current offset from the beginning of the underlying reader.limit
: This is the maximum offset in the underlying reader up to where data can be read.
1
2
3
4
5
6
type SectionReader struct {
r ReaderAt
base int64
off int64
limit int64
}
The SectionReader
struct has a few methods, including Read
, Seek
, and ReadAt
that operate on the section of the underlying reader r
between base
and limit
.
Read
1
2
3
4
5
6
7
8
9
10
11
func (s *SectionReader) Read(p []byte) (n int, err error) {
if s.off >= s.limit {
return 0, EOF
}
if max := s.limit - s.off; int64(len(p)) > max {
p = p[:max]
}
n, err = s.r.ReadAt(p, s.off)
s.off += int64(n)
return
}
The Read
method is primarily responsible for reading data from the specified section. From the developer’s perspective, it offers convenience by automatically managing the offset and avoiding reading beyond the defined section.
When you use Read
, you don’t need to worry about manually increasing the offset or checking if you’ve read the same bytes more than once. All you need to do is to provide a byte slice that Read
fills with data. If you continue reading with the same SectionReader, it will continue from where it last left off, and once it has read to the limit of the section, it will return an EOF error.
Seek
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
var errWhence = errors.New("Seek: invalid whence")
var errOffset = errors.New("Seek: invalid offset")
func (s *SectionReader) Seek(offset int64, whence int) (int64, error) {
switch whence {
default:
return 0, errWhence
case SeekStart:
offset += s.base
case SeekCurrent:
offset += s.off
case SeekEnd:
offset += s.limit
}
if offset < s.base {
return 0, errOffset
}
s.off = offset
return offset - s.base, nil
}
The Seek
function lets the developer move the current offset to a desired location within the section. This is useful when you want to skip to a particular part of the section without having to read through everything before it.
Depending on the whence
parameter, you can set the offset relative to the start of the section, the current offset, or the end of the section. Seek
keeps you within the bounds of the section, returning an error if you try to seek before the beginning of the section.
ReadAt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
func (s *SectionReader) ReadAt(p []byte, off int64) (n int, err error) {
if off < 0 || off >= s.limit-s.base {
return 0, EOF
}
off += s.base
if max := s.limit - off; int64(len(p)) > max {
p = p[:max]
n, err = s.r.ReadAt(p, off)
if err == nil {
err = EOF
}
return n, err
}
return s.r.ReadAt(p, off)
}
The ReadAt
function provides more explicit control compared to Read
. It reads from the underlying reader at a specific offset, without changing the current offset of the SectionReader. This can be useful when you need to read specific parts of the section multiple times or in a non-linear order.
The ReadAt
function, like Read
, also respects the limits of the section and ensures you don’t read beyond it.
Size
1
2
// Size returns the size of the section in bytes.
func (s *SectionReader) Size() int64 { return s.limit - s.base }
The Size
function provides a simple way to get the size of the section in bytes. This can be useful when you need to know the length of the section for calculations or for determining the size of the byte slice to be read.
Using SectionReader
Read
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
file, err := os.Open("myfile.txt")
if err != nil {
log.Fatal(err)
}
reader := io.NewSectionReader(file, 5, 100)
firstChunk := make([]byte, 10)
_, err = reader.Read(firstChunk)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(firstChunk))
secondChunk := make([]byte, 15)
_, err = reader.Read(secondChunk)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(secondChunk))
lastChunk, err := io.ReadAll(reader)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(lastChunk))
If the text in “myfile.txt” was “Hello, world! This is an example of using SectionReader in Go”, the output of this program would be:
1
2
3
, world! T
his is an examp
le of using SectionReader in Go
Here, we used Read
function to read a section of the file starting from the 5th byte and stopping after 100 bytes. We read the first 10 bytes, then the next 15 bytes, and finally read the rest of the section. Read
managed the offset for us, allowing us to focus on processing the data.
Seek
1
2
3
4
5
6
7
8
9
10
11
12
13
14
reader := io.NewSectionReader(file, 5, 100)
_, err = reader.Seek(20, io.SeekStart)
if err != nil {
log.Fatal(err)
}
chunk := make([]byte, 15)
_, err = reader.Read(chunk)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(chunk))
This is output:
1
example of usin
We moved the offset to the 20th byte from the start of the section, and then read 15 bytes from there. So the output is the string starting at the 25th byte of the file (5 bytes of initial offset + 20 bytes from Seek
) and is 15 bytes long.
ReadAt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
reader := io.NewSectionReader(file, 5, 100)
chunk := make([]byte, 10)
_, err = reader.ReadAt(chunk, 20)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(chunk))
_, err = reader.Read(chunk)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(chunk))
This is output:
1
2
example of
, world! T
We used ReadAt
to read 10 bytes at the 20th byte of the section (25th byte of the file), and then used Read
to read the first 10 bytes of the section again. ReadAt
didn’t change the offset, so Read
started from the beginning of the section.
Size
1
2
reader := io.NewSectionReader(file, 5, 100)
fmt.Println(reader.Size())
This is output:
1
100
Conclusion
The io.SectionReader
in Go is a powerful tool for reading specific sections of a file or data stream. It provides a way to read, seek, and read at a specific position in the section, making it a versatile tool for handling file operations. Whether you’re dealing with large files or need to process specific sections of data, the SectionReader
can be a great asset in your Go programming toolkit. Its efficient design and easy-to-use interface make it a go-to choice for many developers.
Happy coding!