Lesson 3
Reading Files Character-by-Character with InputStream
Introduction to InputStream

Welcome to this lesson on reading files character by character. Building on our earlier work with text files, we will use input streams to read a file one character at a time. Because a stream delivers data incrementally, this technique works for files of any size without loading the whole file into memory. By the end of this lesson, you'll be able to read an entire file, read only a specific portion, and process a file in manageable chunks.

Example File

Before we jump into reading files, let's review the example file that we will work with:

```text
Hi!
This file contains some sample example text to use to test how the read method works.
Let's do some programming!
```

This file contains multiple lines of varied lengths.

Understanding InputStreams

InputStream (java.io.InputStream) reads data from a source as a sequence of bytes. For text in a single-byte encoding such as ASCII, each byte corresponds to one character, so the stream can be used to read such a file character by character, which is helpful for detailed text processing. To read a file's entire content this way, we open the file with os.read.inputStream, which returns an InputStream, and call its read() method repeatedly. Each call returns the next byte as an integer between 0 and 255, or -1 once the end of the stream has been reached.
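The return-value contract of read() is easiest to see on a small in-memory stream. The sketch below uses java.io.ByteArrayInputStream from the Java standard library as a stand-in for a file; the helper name `readAllCodes` is hypothetical and exists only for illustration.

```scala
import java.io.{ByteArrayInputStream, InputStream}

// Hypothetical helper: collects every value that read() returns,
// including the final -1 that signals the end of the stream.
def readAllCodes(in: InputStream): List[Int] =
  val codes = List.newBuilder[Int]
  var value = in.read()
  while (value != -1) {
    codes += value
    value = in.read()
  }
  codes += value // the -1 end-of-stream marker
  codes.result()

@main def showReadContract(): Unit =
  // "Hi!" encoded as ASCII bytes stands in for a three-character file
  val in = new ByteArrayInputStream("Hi!".getBytes("US-ASCII"))
  println(readAllCodes(in)) // List(72, 105, 33, -1)
```

The values 72, 105, and 33 are the ASCII codes of H, i, and !, which is why the result of read() has to be converted with toChar before printing.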

Here is how to read a file completely character by character using a loop:

```scala
import os._

@main def main() =
  // Specify the file path
  val filePath = os.pwd / "example.txt"

  println("Reading file character-by-character:")

  // Open the file and get an input stream
  val inputStream = os.read.inputStream(filePath)

  // Read the first byte from the input stream
  var byte = inputStream.read()

  // Continue reading until the end of the file is reached
  while (byte != -1) {
    // Convert the byte to a character and print it
    print(byte.toChar)
    // Read the next byte
    byte = inputStream.read()
  }

  // Close the input stream after processing
  inputStream.close()
```

The read() method returns the next byte as an integer, which toChar converts to a character. This one-to-one conversion is only correct for single-byte encodings such as ASCII, which is all the example file contains. The loop ends when read() returns -1, the end-of-file signal. Always close the input stream when you are done to release system resources and avoid leaving the file locked.

The expected output is:

```text
Reading file character-by-character:
Hi!
This file contains some sample example text to use to test how the read method works.
Let's do some programming!
```

This approach suits situations where you need to process a file one character at a time. Only a single byte is held in memory at any moment, so memory use stays constant regardless of the file's size.
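One caveat: converting each byte with toChar assumes one byte per character, which holds for ASCII but not for UTF-8 text containing accented or non-Latin characters. A possible remedy, sketched here as an assumption rather than as part of the lesson's own method, is to wrap the stream in a java.io.InputStreamReader, which decodes bytes into characters. The helper name `readDecoded` is hypothetical, and an in-memory stream stands in for the file.

```scala
import java.io.{ByteArrayInputStream, InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets

// Hypothetical helper: decodes the stream as UTF-8 and returns its text.
def readDecoded(in: InputStream): String =
  val reader = new InputStreamReader(in, StandardCharsets.UTF_8)
  val text = new StringBuilder
  var code = reader.read() // a decoded UTF-16 code unit, or -1 at end of stream
  while (code != -1) {
    text.append(code.toChar)
    code = reader.read()
  }
  text.toString

@main def showDecoding(): Unit =
  // "caf\u00e9" is "café"; the last letter occupies two bytes in UTF-8
  val bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8)
  println(bytes.length)                                          // 5
  println(readDecoded(new ByteArrayInputStream(bytes)).length)   // 4
```

Five bytes decode to four characters, so a byte-by-byte toChar loop would print two wrong characters in place of the accented letter.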

Reading Defined Portions of a File

In many situations, you may only need to read a specific number of characters rather than the entire file. This can be accomplished by utilizing read() inside a loop that iterates a set number of times based on the count of characters you'd like to read.

```scala
import os._

@main def main() =
  // Specify the file path
  val filePath = os.pwd / "example.txt"

  println("Reading first 10 characters:")

  // Open the file and get an input stream
  val inputStream = os.read.inputStream(filePath)

  // Number of characters to read
  var charactersToRead = 10

  // Read the first byte from the input stream
  var byte = inputStream.read()

  // Continue reading until the specified number is reached or end-of-file
  while (byte != -1 && charactersToRead > 0) {
    // Convert the byte to a character and print it
    print(byte.toChar)
    // Read the next byte
    byte = inputStream.read()
    // Decrement the charactersToRead count
    charactersToRead -= 1
  }

  // Close the input stream after processing
  inputStream.close()
```

The loop continues until the specified number of characters (in this case, 10) is read or until the end of the file is reached, whichever comes first.

The expected output is:

```text
Reading first 10 characters:
Hi!
This f
```

This method is especially useful for preliminary processing or debugging large files.
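The counting loop can also be packaged as a reusable function. The sketch below is one possible shape, not the lesson's own code: the helper `readUpTo` is hypothetical, it checks the remaining count before each read() call so it never consumes a byte it does not return, and an in-memory stream stands in for the file. The demo closes the stream through scala.util.Using from the Scala standard library.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import scala.util.Using

// Hypothetical helper: returns at most `limit` characters from the stream.
// Assumes single-byte (ASCII) content, like the lesson's example file.
def readUpTo(in: InputStream, limit: Int): String =
  val text = new StringBuilder
  var done = false
  while (!done && text.length < limit) {
    val byte = in.read()
    if (byte == -1) done = true // end of stream reached before the limit
    else text.append(byte.toChar)
  }
  text.toString

@main def showReadUpTo(): Unit =
  val bytes = "Hi!\nThis file contains".getBytes("US-ASCII")
  // Using.resource closes the stream even if the body throws
  Using.resource(new ByteArrayInputStream(bytes)) { in =>
    println(readUpTo(in, 10))
  }
```

Because the helper stops reading as soon as the limit is reached, the stream is left positioned exactly at the next unread character.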

Sequential File Reading

Sequential reading means reading a file's contents in order, part by part, without restarting from the beginning each time. When read() is called in successive loops on the same stream, each read continues from where the previous one stopped.

This approach is particularly beneficial when dealing with large files or when you require distinct segments from a file. Unlike loading the entire file into memory at once, which can be inefficient and resource-intensive for large files, sequential reading processes data in smaller, manageable segments, thus conserving memory and allowing the application to maintain performance and responsiveness.

Here's an example:

```scala
import os._

@main def main() =
  // Specify the file path
  val filePath = os.pwd / "example.txt"

  // Open the file and get an input stream
  val inputStream = os.read.inputStream(filePath)

  println("First 10 characters:")

  // First segment: read the initial 10 characters
  var firstReadLength = 10
  var byte = inputStream.read()
  while (byte != -1 && firstReadLength > 0) {
    print(byte.toChar)
    byte = inputStream.read()
    firstReadLength -= 1
  }

  println("\n\nNext 10 characters:")

  // Second segment: read the next 10 characters.
  // `byte` already holds the 11th byte (read at the end of the first loop),
  // so calling read() again here would skip one character.
  var nextReadLength = 10
  while (byte != -1 && nextReadLength > 0) {
    print(byte.toChar)
    byte = inputStream.read()
    nextReadLength -= 1
  }

  // Close the input stream after processing
  inputStream.close()
```

This program reads and prints the first 10 characters from the file, then moves forward to read the next set of 10 characters. The input stream maintains its position within the file, allowing seamless sequential access to subsequent portions.

Expected output would be:

```text
First 10 characters:
Hi!
This f

Next 10 characters:
ile contai
```

This technique is advantageous when you only need specific sections of a file or when you want to process large files in smaller chunks, thereby reducing memory consumption and improving performance. It facilitates better data manipulation in scenarios where progressive reading is required.
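The same idea extends from two hand-written loops to any number of segments. The following is a minimal sketch under stated assumptions: `segments` is a hypothetical helper, the content is ASCII, and an in-memory stream stands in for the file. It returns a lazy iterator, so each segment is read only when it is requested.

```scala
import java.io.{ByteArrayInputStream, InputStream}

// Hypothetical helper: lazily yields successive segments of up to `size`
// characters. Each segment starts where the previous one stopped because
// the stream keeps track of its own position. Assumes ASCII content.
def segments(in: InputStream, size: Int): Iterator[String] =
  require(size > 0, "size must be positive")
  Iterator
    .continually {
      val text = new StringBuilder
      var done = false
      while (!done && text.length < size) {
        val byte = in.read()
        if (byte == -1) done = true
        else text.append(byte.toChar)
      }
      text.toString
    }
    .takeWhile(_.nonEmpty) // an empty segment means the stream is exhausted

@main def showSegments(): Unit =
  val in = new ByteArrayInputStream("Hi!\nThis file contains some".getBytes("US-ASCII"))
  // Print only the first two segments, each wrapped in brackets
  segments(in, 10).take(2).foreach(segment => println(s"[$segment]"))
  in.close()
```

The last segment may be shorter than `size` when the stream ends partway through it.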

Summary and Next Steps

In this lesson, you learned how to effectively use input streams for various file reading tasks. We covered reading entire files, specific portions, and processing large files efficiently by chunk reading. These techniques provide you with control and flexibility in handling text data.

Experiment with these methods using different files to strengthen your understanding of file manipulation. Keep practicing to enhance your skills in file handling and data processing. You have made substantial progress in mastering file manipulation.
